Building a table from a CSV, grouped by week, with Python and pandas

I am using pandas and load the following table with this code:

tst = pd.read_csv('Iteracao.csv',delimiter=",")

[image: the loaded table]

I am trying to group the data as follows, where week 1 is the week 2017-02-10/2017-02-16 and the other weeks follow the same logic. The value does not need to be the week number; it can be the date range, but this is the grouping I am after.

[image: the desired table, grouped by week]

My attempt used the following logic. I stored the week in a column

semana = df['Semana'].dt.to_period('W-THU')
semana = np.unique(semana)

and kept each value only once. Then I tried to loop with iterrows to check whether semana[i] equals the row's value in row['Semana'], and then save the match to another table.

for index, row in tst.iterrows():
    for i in range(s):      
        n = semana[i]
        if row['Semana'] == n :
            print('1')
            tp['Nome']=row['Nome completo']
            tp['count']=row['count']
            print(tp)

The answer does not have to be code; the logic alone is enough for me to follow. Thank you.

  • Post the data instead of images; it is easier for anyone who wants to try to help.

1 answer

TL;DR
I noticed that you already applied that answer and exported the resulting table to a CSV, and now you want to build the final table by reading it back, right? Since you posted images rather than data in this question, I will use the data from that answer (exported to a CSV named test1.csv) to present the solution. I honestly tried to honor your request for an explanation instead of code, but in the end not even I could understand what I had written. I started implementing it to check, and ended up doing everything differently :-). So I will answer with code.
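For context, a weekly-count file of that shape could be produced roughly as in the sketch below. This is only an illustration: the raw rows and the column name 'Data' are made up, and 'W-THU' is taken from the question's own attempt, so that each week runs Friday to Thursday.

```python
import pandas as pd

# Hypothetical raw data: one row per event, with a date and a name.
raw = pd.DataFrame({
    'Data': pd.to_datetime(['2017-03-03', '2017-03-06', '2017-03-09',
                            '2017-03-10', '2017-03-16']),
    'Nome': ['Talita', 'Talita', 'Jao', 'Talita', 'Guilherme'],
})

# Weekly periods that end on Thursday; as text they read 'start/end'.
raw['Semana'] = raw['Data'].dt.to_period('W-THU').astype(str)

# Count the rows per (week, name) pair.
weekly = raw.groupby(['Semana', 'Nome']).size().reset_index(name='count')
print(weekly)
```

Calling weekly.to_csv('test1.csv', index=False) would then write a file with the same three columns used in the rest of this answer.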

File reading:

import pandas as pd
from collections import OrderedDict

# Read the file into a list of lines
with open('test1.csv', mode='r') as f:
    lines = f.readlines()

# Drop the header
del lines[0]

Initial result:

lines
['2017-03-03/2017-03-09,Filho,8\n',
 '2017-03-03/2017-03-09,Gabriel,2\n',
 '2017-03-03/2017-03-09,Jao,10\n',
 '2017-03-03/2017-03-09,Otavio,6\n',
 '2017-03-03/2017-03-09,Talita,9\n',
 '2017-03-10/2017-03-16,Guilherme,1\n',
 '2017-03-10/2017-03-16,Talita,7\n']

Creating the dictionary and auxiliary variables

semanas = []
d1 = OrderedDict()
for l in lines:
    week, name, count = l.rstrip().split(',')
    if week not in semanas:
        semanas.append(week)
        d1[week] = {}
    d1[week][name] = count

# Unique names, in order of first appearance
nomes = []
for v in d1.values():
    for k in v:
        if k not in nomes:
            nomes.append(k)

Tabulating data to feed the dataframe

data = [] 
for nome in nomes:
    n = []
    for semana in semanas:
        if nome in d1[semana]:
            n.append(d1[semana][nome])
        else:
            n.append(0)
    data.append(n) 

Intermediate state (dictionary, names, weeks and data):

d1
{'2017-03-03/2017-03-09': {'Filho': '8',
  'Gabriel': '2',
  'Jao': '10',
  'Otavio': '6',
  'Talita': '9'},
 '2017-03-10/2017-03-16': {'Guilherme': '1', 'Talita': '7'}}

nomes
['Filho', 'Gabriel', 'Jao', 'Otavio', 'Talita', 'Guilherme']

semanas
['2017-03-03/2017-03-09', '2017-03-10/2017-03-16']

data
[['8', 0], ['2', 0], ['10', 0], ['6', 0], ['9', '7'], [0, '1']]

Building and formatting the final table.

columns = []
for i in range(len(semanas)):
    columns.append('Week'+str(i))    
df = pd.DataFrame(data, index=nomes, columns=columns)

Final result (image):

[image: final result]

You can name the index if you want. I generally do not, because the name is displayed on an extra, otherwise blank row of its own instead of filling the empty cell above the index column. I have not yet found a way to avoid that.
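One possible workaround, offered only as a sketch: setting a name on the columns axis rather than on the index makes pandas print the label in the header row, in the cell above the index column. The small table below is a two-row subset of the data above. Note that the label then technically belongs to the columns, not to the index, which may or may not matter for later processing.

```python
import pandas as pd

table = pd.DataFrame([['8', 0], ['9', '7']],
                     index=['Filho', 'Talita'],
                     columns=['Week0', 'Week1'])

# Naming the index adds a separate row that holds only the name.
with_index_name = table.copy()
with_index_name.index.name = 'Nome'
index_name_text = with_index_name.to_string()

# Naming the columns axis puts the label in the header row instead.
with_columns_name = table.copy()
with_columns_name.columns.name = 'Nome'
columns_name_text = with_columns_name.to_string()

print(index_name_text)
print()
print(columns_name_text)
```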

View the code execution on a Jupyter Notebook, here.
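As an alternative to the manual loops, the same table can probably be built with pivot_table. This is a sketch rather than the method used above: the header names 'Semana', 'Nome' and 'count' are assumed (the header line was discarded when reading the file), and the rows are inlined with io.StringIO only to keep the example self-contained.

```python
import io
import pandas as pd

# Same rows as test1.csv above; the header names are an assumption.
csv_text = """Semana,Nome,count
2017-03-03/2017-03-09,Filho,8
2017-03-03/2017-03-09,Gabriel,2
2017-03-03/2017-03-09,Jao,10
2017-03-03/2017-03-09,Otavio,6
2017-03-03/2017-03-09,Talita,9
2017-03-10/2017-03-16,Guilherme,1
2017-03-10/2017-03-16,Talita,7
"""
df = pd.read_csv(io.StringIO(csv_text))

# One row per name, one column per week, 0 where a name has no entry.
table = df.pivot_table(index='Nome', columns='Semana', values='count',
                       aggfunc='sum', fill_value=0)

# Replace the date ranges with Week0, Week1, ... as in the code above.
table.columns = ['Week' + str(i) for i in range(len(table.columns))]
print(table)
```

Two differences from the loop version: the names come out sorted alphabetically instead of in order of first appearance, and the counts are numbers rather than a mix of strings and 0.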
