Consider two data sets read from *.CSV files with Pandas. Each set has only one field, CPF Favorecido, and contains millions of records. Each data set corresponds to one month.
I need to find out which records (CPF numbers) are in one data set but not in the other.
The code looks like this:
import pandas

# Read only the 'CPF Favorecido' column from each tab-delimited monthly file
atual = pandas.read_csv(arquivo_atual, header=0, delimiter='\t', quotechar='"', usecols=['CPF Favorecido'])
seguinte = pandas.read_csv(arquivo_seguinte, header=0, delimiter='\t', quotechar='"', usecols=['CPF Favorecido'])
I just need the count of the numbers that appear in the atual file but not in the seguinte file, and vice versa.
Is there a function that counts these records, or do I need to write a loop and compare them one by one?
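For reference, here is a minimal sketch of one way to get these counts without an explicit loop, using `Series.isin` with a negated boolean mask. The small in-memory frames are made-up sample data standing in for the two monthly files, and the names `col`, `so_atual`, `so_seguinte` and the counters are chosen only for this illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the two monthly CSV files
atual = pd.DataFrame({'CPF Favorecido': ['111', '222', '333', '333']})
seguinte = pd.DataFrame({'CPF Favorecido': ['222', '444']})

col = 'CPF Favorecido'

# Rows of atual whose CPF does not appear anywhere in seguinte
so_atual = atual[~atual[col].isin(seguinte[col])]
# Rows of seguinte whose CPF does not appear anywhere in atual
so_seguinte = seguinte[~seguinte[col].isin(atual[col])]

# Row counts include repeated CPFs; nunique() counts each CPF once
n_rows_so_atual = len(so_atual)
n_unique_so_atual = so_atual[col].nunique()
n_unique_so_seguinte = so_seguinte[col].nunique()

print(n_rows_so_atual, n_unique_so_atual, n_unique_so_seguinte)
```

Whether to use the row count or the distinct count depends on whether a CPF can repeat within a month. If only distinct values matter, comparing `set(atual[col])` and `set(seguinte[col])` with the set difference operator should give the same distinct counts.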
Dude, I wasn't getting the syntax of this command right! I tried using "isin" several times and it gave an error. Very grateful, problem solved!
– Sandro