1
I am converting 2 dataframes into 2 lists. And I will group them by the CNPJ_UF_MES column (always the 1st column of each dataframe. In the real example, I will join more than 2 dataframes.
After grouping these lists, I will have lists of lists. and need to unite them based on the key column.
import pandas as pd
df1 = pd.DataFrame({
'CNPJ_UF_MES': ['1-MG', '1-MG', '2-MG'],
'CLIENTE_PAR_COMEX': ['1_2_3', '2_3_4', '3_4_5'],
'REG': ['1110', '1110', '1110'],
'COD_IP_PAR': ['200', '200', '200'],
'COD_CLIENTE': ['xxr', 'xxv', 'xxw'],
'IND_COMEX': ['wer', 'cad', 'sder'],
'IND_EXTEMP': ['key_1', 'key_2', 'key_3'],
'DT_INI': ['01032021', '01032021', '01032021'],
'DT_FIN': ['31032021', '31032021', '31032021'],
'VALOR': ['wer', 'cad', 'sder'],
'PIPE_FIN': ['', '', '']
})
df2 = pd.DataFrame({
'CNPJ_UF_MES': ['1-MG', '1-MG', '1-MG', '1-MG', '2-MG', '2-MG'],
'CLIENTE_PAR_COMEX': ['1_2_3', '2_3_4', '3_4_5','1_2_3', '2_3_4', '3_4_5'],
'REG': ['1110', '1110', '1110', '1110', '1110', '1110'],
'COD_MCAPT': ['C-200', 'C-200', 'C-200', 'C-200', 'C-200', 'C-200'],
'DT_OP': ['07032021', '07032021', '07032021', '08032021', '08032021', '08032021'],
'VALOR': ['1', '2', '3', '1', '2', '3'],
'QTD': ['2', '2', '2', '2', '2', '2'],
'CNPJ_ADQUI': ['123', '123', '123', '123', '123', '123'],
'PIPE_FIN': ['', '', '', '', '', '']
})
list1 = df1.values.tolist()
#print(list1)
values = set(map(lambda x:x[0], list1))
newlist1 = [[y for y in list1 if y[0]==x] for x in values]
print(newlist1)
list2 = df2.values.tolist()
#print(list2)
values = set(map(lambda x:x[0], list2))
newlist2 = [[y for y in list2 if y[0]==x] for x in values]
print(newlist2)
##Estou tendo problema aqui:
list3 = newlist1.append(newlist2)
print(list3)
I would like to have lists of lists grouped by the key column.
Any idea?
This is the desired output:
[[
['2-MG', '3_4_5', '1110', '200', 'xxw', 'sder', 'key_3', '01032021', '31032021', 'sder', ''],
['2-MG', '2_3_4', '1110', 'C-200', '08032021', '2', '2', '123', ''],
['2-MG', '2_3_4', '1110', 'C-200', '08032021', '2', '2', '123', '']
],
[
['1-MG', '1_2_3', '1110', '200', 'xxr', 'wer', 'key_1', '01032021', '31032021', 'wer', ''],
['1-MG', '2_3_4', '1110', '200', 'xxv', 'cad', 'key_2', '01032021', '31032021', 'cad', ''],
['1-MG', '1_2_3', '1110', 'C-200', '07032021', '1', '2', '123', ''],
['1-MG', '2_3_4', '1110', 'C-200', '07032021', '2', '2', '123', ''],
['1-MG', '3_4_5', '1110', 'C-200', '07032021', '3', '2', '123', ''],
['1-MG', '1_2_3', '1110', 'C-200', '08032021', '1', '2', '123', '']
]]
Later I will manipulate these lists and create CSV files, each file grouped by the key column "CNPF_UF_MES".
If they are the same fields, why don’t you concatenate the dataframes and then generate the lists?
– Paulo Marques
Thanks for the feedback. But the dfs columns are very different, and I need to convert to lists first.
– Tanai Goncalves