Good afternoon.
I don't have much experience with Python and I have a question. Thanks in advance to anyone who can help.
I opened my CSV file in Python as follows:
import pandas as pd
caminhoArquivo = r'\\Desktop\Base\dias.csv'
baseDados = pd.read_csv(caminhoArquivo,sep=';',decimal=',',encoding='latin-1')
Example of the file contents:
Index | Nome | Dia
0 | Pedro | 3
1 | Pedro | 3
2 | Pedro | 24
3 | Antonio| 24
4 | Antonio| 24
5 | Antonio| 24
6 | Carlos | 4
7 | Carlos | 4
8 | Carlos | 28
9 | Jose | 1
10 | Jose | 2
11 | Jose | 2
I removed the duplicate rows with:
colunas = ['Nome','Dia']
diaDuplicado = baseDados.drop_duplicates(subset = colunas)
diaDuplicado = diaDuplicado.reset_index()
The result was:
Index | index | Nome | Dia
0 | 0 | Pedro | 3
1 | 2 | Pedro | 24
2 | 3 | Antonio| 24
3 | 6 | Carlos | 4
4 | 8 | Carlos | 28
5 | 9 | Jose | 1
6 | 10 | Jose | 2
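A side note on the extra "index" column above: reset_index() keeps the old index as a new column unless drop=True is passed. A minimal sketch, assuming the sample rows from the file are built inline instead of being read from the CSV:

```python
import pandas as pd

# Sample rows copied from the example file above (built inline, not read from disk).
baseDados = pd.DataFrame({
    "Nome": ["Pedro"] * 3 + ["Antonio"] * 3 + ["Carlos"] * 3 + ["Jose"] * 3,
    "Dia": [3, 3, 24, 24, 24, 24, 4, 4, 28, 1, 2, 2],
})

colunas = ["Nome", "Dia"]

# drop=True discards the old index instead of storing it in an "index" column.
diaDuplicado = baseDados.drop_duplicates(subset=colunas).reset_index(drop=True)

print(diaDuplicado)
```

With drop=True the frame keeps only the "Nome" and "Dia" columns, renumbered from 0.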
Now my question: I need to group the days by name, so that the result looks like this:
Index | Nome | Dia
0 | Pedro | 3, 24
1 | Antonio| 24
2 | Carlos | 4, 28
3 | Jose | 1, 2
But the only solution I could find was:
diasgroup = diaDuplicado.groupby(by=['Nome'])['Dia'].apply(list)
But this turns the "Nome" column into the index, and the result has dtype "object":
Index | Dia
Pedro | 3, 24
Antonio| 24
Carlos | 4, 28
Jose | 1, 2
Could someone help me?
Wouldn't using diasgroup.reset_index() work? – AlexCiuffa
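The reset_index() suggestion in the comment can be sketched as follows. This is only an illustration under a few assumptions: the deduplicated data is rebuilt inline from the table in the question, sort=False is added so that names keep their order of first appearance, and the "DiaTexto" column is a hypothetical name introduced here for a "3, 24" style string.

```python
import pandas as pd

# Deduplicated data copied from the question (built inline for the example).
diaDuplicado = pd.DataFrame({
    "Nome": ["Pedro", "Pedro", "Antonio", "Carlos", "Carlos", "Jose", "Jose"],
    "Dia": [3, 24, 24, 4, 28, 1, 2],
})

# sort=False keeps the names in order of first appearance (default is alphabetical).
# reset_index() turns the "Nome" index back into a regular column.
diasgroup = (
    diaDuplicado.groupby("Nome", sort=False)["Dia"]
    .apply(list)
    .reset_index()
)

# Optional: join each list into a comma-separated string.
# "DiaTexto" is a hypothetical column name, not part of the original data.
diasgroup["DiaTexto"] = diasgroup["Dia"].apply(
    lambda dias: ", ".join(str(d) for d in dias)
)

print(diasgroup)
```

The "Dia" column still has dtype object because each cell holds a Python list; that is expected for this kind of grouping and does not prevent "Nome" from being a normal column again.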