Group three commands into one

Asked

Viewed 577 times

1

My dataframe is called f0219. The following command filters a column of this df called Typorubrica when the code is equal to 2; this filter is assigned to the variable f0219Teto; Then, another command groups the new dataframe created, f0219Teto, by Matricula and Nome, and sums the contents of the column "Value", assigning the result to the variable f0219somaTeto. Finally, Check if any salary exceeded the set ceiling. Are three commands to do this. They are working, and return exactly what I want. However, I would like to group these three commands into one. How can I do this?

f0219Teto = f0219[f0219.Tiporubrica == 2] # filtra por codigo igual a 2

f0219somaTeto = f0219Teto.groupby(['Matricula', 'Nome'])['Nome','Valor'].sum() # agrupa por matricula e soma

f0219somaTeto[f0219somaTeto.Valor>39200] # verifica se houve ganho acima do teto

Dataframe example I’m working on: https://i.stack.Imgur.com/Qlg8j.png

  • Could you put a small sample of the data to understand the structure? It doesn’t need to be exactly the same, but so that you can manipulate it.

  • I changed the original question with the example dataframe.

2 answers

1

Roger, I understand that it is impossible for you to group the three commands the way you want. That’s because from the moment you use groupby, it returns a new Dataframe (DF) with another constitution, making it impossible for you to filter (>39200) without allocating this DF into a new variable. What you could do is group the first 2 commands, assign a variable as f0219_grouped and then perform the filter. Example below:

Defining the Dataframe

import pandas as pd

data = {'Matricula': [1, 1, 1, 1, 2, 3], 
        'Nome': ['jose', 'jose', 'jose', 'jose', 'maria', 'pedro'], 
        'Tiporubrica': [2, 2, 2, 2, 1, 3],
        'Valor': [4749.33, 8411.01, 1472.29, 6208.33, 444.22, 1258.41]}

f0219 = pd.DataFrame(data)
f0219

#Saída
    Matricula   Nome    Tiporubrica     Valor
0           1   jose              2     4749.33
1           1   jose              2     8411.01
2           1   jose              2     1472.29
3           1   jose              2     6208.33
4           2   maria             1     444.22
5           3   pedro             3     1258.41

Creating the grouped DF

f0219_grouped = f0219[f0219.Tiporubrica >= 2].groupby(['Matricula', 'Nome'])['Nome','Valor'].sum()
f0219_grouped

#Saída  
Matricula   Nome    Valor   
1           jose    20840.96
3           pedro   1258.41

Apply the Fitro

f0219_grouped[f0219_grouped.Valor > 2000]

#Saída      
Matricula   Nome    Valor   
1           jose    20840.96

With this, you reduce 3 commands to 2, but there is no way to reduce to only 1 using pandas group_by.

  • Hello, you exemplified the first command populating new data. Is there an example with the existing dataframe, which I posted in the image? Grateful.

0


You can reduce the 3 commands in a single line by joining the first and third conditions within a (1)loc using (2)groupby with (3)transform, thus:

f0219.loc[(f0219.Tiporubrica == 2) & (f0219.groupby(['Matricula', 'Nome']) \
['Valor'].transform('sum') > 39200)].groupby(['Matricula', 'Nome']) \
['Valor'].sum().reset_index()

Browser other questions tagged

You are not signed in. Login or sign up in order to post.