Good afternoon.
I don't have much experience with Python and I have a question. Thanks in advance to anyone who can help.
I loaded my CSV file in Python as follows:
import pandas as pd
caminhoArquivo = r'\\Desktop\Base\dias.csv'
baseDados = pd.read_csv(caminhoArquivo,sep=';',decimal=',',encoding='latin-1')
Example of the file contents:
Index  |  Nome  |  Dia
  0    | Pedro  |   3
  1    | Pedro  |   3
  2    | Pedro  |   24
  3    | Antonio|   24
  4    | Antonio|   24
  5    | Antonio|   24
  6    | Carlos |   4
  7    | Carlos |   4
  8    | Carlos |   28
  9    |  Jose  |   1
  10   |  Jose  |   2
  11   |  Jose  |   2
I removed the duplicate rows with:
colunas = ['Nome','Dia']
diaDuplicado = baseDados.drop_duplicates(subset = colunas)
diaDuplicado = diaDuplicado.reset_index()
The result was:
 Index |  index  |  Nome  |  Dia
  0    |    0    | Pedro  |   3
  1    |    2    | Pedro  |   24
  2    |    3    | Antonio|   24
  3    |    6    | Carlos |   4
  4    |    8    | Carlos |   28
  5    |    9    |  Jose  |   1
  6    |    10   |  Jose  |   2
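The extra "index" column appears because reset_index() keeps the old index as a new column by default. A minimal sketch of avoiding that, assuming the sample data above is rebuilt inline instead of being read from the CSV file; passing drop=True discards the old index:

```python
import pandas as pd

# Sample data rebuilt inline (an assumption; the real data comes from dias.csv).
baseDados = pd.DataFrame({
    "Nome": ["Pedro"] * 3 + ["Antonio"] * 3 + ["Carlos"] * 3 + ["Jose"] * 3,
    "Dia": [3, 3, 24, 24, 24, 24, 4, 4, 28, 1, 2, 2],
})

colunas = ["Nome", "Dia"]

# drop=True throws away the old index instead of storing it in an "index" column.
diaDuplicado = baseDados.drop_duplicates(subset=colunas).reset_index(drop=True)

print(diaDuplicado)
```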
Now for my question. I need to group the days by name, so that the result looks like this:
Index |  Nome  |  Dia
  0   | Pedro  |   3, 24
  1   | Antonio|   24
  2   | Carlos |   4, 28
  3   |  Jose  |   1, 2
But the only solution I could find was:
diasgroup = diaDuplicado.groupby(by=['Nome'])['Dia'].apply(list)
But this turns the "Nome" column into the index, and the result has dtype "object":
Index  |  Dia
Pedro  |  3, 24
Antonio|  24
Carlos |  4, 28   
 Jose  |  1, 2
Could someone help me?

Wouldn't using diasgroup.reset_index() work? – AlexCiuffa
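A minimal sketch of the reset_index() idea, assuming the sample data from the question is rebuilt inline. The sort=False argument and the string-joined variant (diasTexto, a hypothetical name) are additions for illustration, not part of the original code. Note that a column holding Python lists or strings has dtype "object" in pandas, so that part is expected:

```python
import pandas as pd

# Sample data rebuilt inline (an assumption; the real data comes from dias.csv).
baseDados = pd.DataFrame({
    "Nome": ["Pedro"] * 3 + ["Antonio"] * 3 + ["Carlos"] * 3 + ["Jose"] * 3,
    "Dia": [3, 3, 24, 24, 24, 24, 4, 4, 28, 1, 2, 2],
})
diaDuplicado = baseDados.drop_duplicates(subset=["Nome", "Dia"]).reset_index(drop=True)

# Group the days per name; sort=False keeps the names in order of first appearance.
# reset_index() moves "Nome" from the index back into a regular column.
diasgroup = (
    diaDuplicado.groupby("Nome", sort=False)["Dia"]
    .apply(list)
    .reset_index()
)

# Variant: join the days into one comma-separated string per name.
diasTexto = (
    diaDuplicado.groupby("Nome", sort=False)["Dia"]
    .agg(lambda dias: ", ".join(dias.astype(str)))
    .reset_index()
)

print(diasgroup)
print(diasTexto)
```

The first form keeps the days as lists, which is easier to process further; the second matches the "3, 24" layout shown in the question but stores text rather than numbers.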