I have two CSV files and I want to compare more than one field using duplicated. Is there a way to do that, or can I only pass one column at a time?
I followed Clayton Tosatti's suggestion and got this far, but now I've run into a question.
import pandas as pd
dados = pd.read_csv('gestantes_prenatal.csv')
dados2 = pd.read_csv('cidade_social.csv')
print(dados[['CNS','CNS','CPF','PIS','NASCIMENTO','NOME_DA_MAE']])
print(dados2[['NOME','CNS','CNS','CPF','PIS','NASCIMENTO','NOME_DA_MAE']])
df_aux = pd.concat([dados['CPF'],dados2['CPF']])
Everything up to the last line works perfectly. But I wanted something like:
df_aux = pd.concat([dados['NOME','PIS','CPF'],dados2['NOME','PIS','CPF']])
df_aux[df_aux.duplicated()]
This generates the following error:
KeyError Traceback (most recent call last)
/home/hudson/.local/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2896 try:
-> 2897 return self._engine.get_loc(key)
2898 except KeyError:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: ('NOME', 'PIS', 'CPF')
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-10-5a532f25327c> in <module>()
----> 1 df_aux = pd.concat([dados['NOME','PIS','CPF'],dados2['NOME','PIS','CPF']])
/home/hudson/.local/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
2993 if self.columns.nlevels > 1:
2994 return self._getitem_multilevel(key)
-> 2995 indexer = self.columns.get_loc(key)
2996 if is_integer(indexer):
2997 indexer = [indexer]
/home/hudson/.local/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2897 return self._engine.get_loc(key)
2898 except KeyError:
-> 2899 return self._engine.get_loc(self._maybe_cast_indexer(key))
2900 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2901 if indexer.ndim > 1 or indexer.size > 1:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: ('NOME', 'PIS', 'CPF')
I used these files.
What a difference a pair of brackets makes, huh?! Again, thank you very much, Clayton Tosatti.
– Hudson Souza
Haha, the brackets really do help a lot. I don't know if you got the concept, but basically the inner brackets pass the columns as a list. It should also work if you store the columns in a variable (a list) first and then pass only that variable to the DataFrame.
lista = ['NOME', 'PIS', 'CPF']
df_aux = pd.concat([dados[lista],dados2[lista]])
– Clayton Tosatti
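To illustrate the point about the brackets, here is a minimal sketch. The tiny in-memory DataFrame and its values are made up for the example and only stand in for the real CSV files. It shows that single brackets with several names are treated as one tuple key, which is where the KeyError comes from, while a list of names selects several columns.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV file.
dados = pd.DataFrame({
    'NOME': ['Ana', 'Bia'],
    'PIS': ['111', '222'],
    'CPF': ['001', '002'],
})

# Single brackets with several names build the tuple ('NOME', 'PIS', 'CPF'),
# which pandas looks up as ONE column label, so the lookup fails.
try:
    dados['NOME', 'PIS', 'CPF']
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# Double brackets (or a list variable) pass a list of labels
# and return a DataFrame with just those columns.
colunas = ['NOME', 'PIS', 'CPF']
selecao = dados[colunas]

print(tuple_lookup_failed)
print(selecao)
```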
Perfect, really great! Beautiful language.
– Hudson Souza
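For anyone landing here with the same question, the following is a rough sketch of the multi-column comparison, not necessarily how the original files were processed. The two small DataFrames and their values are invented for illustration; with the real data they would come from pd.read_csv. It assumes both files share the NOME, PIS and CPF columns, and it uses the subset and keep parameters of duplicated to control which columns are compared and which copies are flagged.

```python
import pandas as pd

# Hypothetical sample data; in practice these come from pd.read_csv(...).
dados = pd.DataFrame({
    'NOME': ['Ana', 'Bia', 'Carla'],
    'PIS': ['111', '222', '333'],
    'CPF': ['001', '002', '003'],
})
dados2 = pd.DataFrame({
    'NOME': ['Bia', 'Dani', 'Carla'],
    'PIS': ['222', '444', '999'],
    'CPF': ['002', '004', '003'],
})

colunas = ['NOME', 'PIS', 'CPF']

# Stack the selected columns of both frames; ignore_index gives unique row labels.
df_aux = pd.concat([dados[colunas], dados2[colunas]], ignore_index=True)

# Rows whose NOME, PIS and CPF all match an earlier row.
repetidos = df_aux[df_aux.duplicated()]

# Compare only some columns, and flag every copy rather than just the later ones.
mesmo_nome_cpf = df_aux[df_aux.duplicated(subset=['NOME', 'CPF'], keep=False)]

print(repetidos)
print(mesmo_nome_cpf)
```

Note that rows repeated inside a single file are flagged too, since duplicated only sees the concatenated result.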