Cumulative sum per row


Good afternoon, colleagues. I would like some help. In the code below I generate a new column (acumulado) using cumsum. The result is a cumulative sum across all rows. However, I need the cumulative sum to be computed row by row, separately for each value of the cor (color) column. I tried doing it with tuples and if statements, but that was too slow for millions of rows. Please help me with this.

Here is the code:

import pandas as pd

import numpy as np

df = pd.DataFrame({'cor': ['azul', 'preto', 'amarelo', 'azul', 'preto', 'amarelo', 'preto', 'azul', 'amarelo', 'azul', 'amarelo'],
                   'preco': [1,1,1,2,3,4,5,3,2,4,1]})

df2 = df.preco

df['acumulado'] = df2.cumsum()

df

Result:
        cor  preco  acumulado
0      azul      1          1
1     preto      1          2
2   amarelo      1          3
3      azul      2          5
4     preto      3          8
5   amarelo      4         12
6     preto      5         17
7      azul      3         20
8   amarelo      2         22
9      azul      4         26
10  amarelo      1         27

Thank you.

1 answer

To get the result you want, call cumsum() together with the groupby() function.

The groupby() function groups the rows by the columns you choose and lets you perform operations such as mean, sum, and others on each group; in this case I used cumsum().
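As a minimal sketch of that difference, using the same data as the question: aggregations such as sum() and mean() return one value per group, while cumsum() returns one value per original row. The variable names totals, means, and running are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'cor': ['azul', 'preto', 'amarelo', 'azul', 'preto', 'amarelo',
            'preto', 'azul', 'amarelo', 'azul', 'amarelo'],
    'preco': [1, 1, 1, 2, 3, 4, 5, 3, 2, 4, 1],
})

# Aggregations collapse each group to a single value (one entry per color).
totals = df.groupby('cor')['preco'].sum()
means = df.groupby('cor')['preco'].mean()

# cumsum() keeps one value per original row, aligned on the original index.
running = df.groupby('cor')['preco'].cumsum()

print(totals)
print(means)
print(running)
```

Because the cumsum() result shares the index of df, it can be assigned directly as a new column.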

import pandas as pd

df = pd.DataFrame({'cor': ['azul', 'preto', 'amarelo', 'azul', 'preto', 'amarelo', 'preto', 'azul', 'amarelo', 'azul', 'amarelo'],
                   'preco': [1,1,1,2,3,4,5,3,2,4,1]})

# Running total of 'preco' within each 'cor' group.
# Selecting the column explicitly keeps this working if df has other numeric columns.
df['acumulado'] = df.groupby('cor')['preco'].cumsum()

# The steps below only make the DataFrame easier to read
# 1 - Sort by the 'cor' column; a stable sort keeps the original row order within each color
# 2 - Reset the index and drop the old one
df = df.sort_values('cor', kind='stable').reset_index(drop=True)

# Print the DataFrame
print(df)

Expected resulting DataFrame:
        cor  preco  acumulado
0   amarelo      1     1
1   amarelo      4     5
2   amarelo      2     7
3   amarelo      1     8
4      azul      1     1
5      azul      2     3
6      azul      3     6
7      azul      4    10
8     preto      1     1
9     preto      3     4
10    preto      5     9
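The sort in the answer appears to be cosmetic only: the grouped cumsum() is already aligned with the original rows. The sketch below skips the sort and checks the vectorized result against a plain Python loop that keeps one running total per color. The loop is a hypothetical stand-in for the slow "tuples and ifs" approach mentioned in the question, not the asker's actual code.

```python
import pandas as pd

df = pd.DataFrame({
    'cor': ['azul', 'preto', 'amarelo', 'azul', 'preto', 'amarelo',
            'preto', 'azul', 'amarelo', 'azul', 'amarelo'],
    'preco': [1, 1, 1, 2, 3, 4, 5, 3, 2, 4, 1],
})

# Vectorized: running total of 'preco' within each color, in original row order.
df['acumulado'] = df.groupby('cor')['preco'].cumsum()

# Reference implementation (illustrative only): one running total per color.
running_totals = {}
expected = []
for cor, preco in zip(df['cor'], df['preco']):
    running_totals[cor] = running_totals.get(cor, 0) + preco
    expected.append(running_totals[cor])

# True when the vectorized column equals the loop result row by row.
matches = df['acumulado'].tolist() == expected
print(matches)
print(df)
```

On millions of rows the groupby version should be much faster than the loop, since the work happens inside pandas rather than in Python-level iteration.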

For more information, there is a linked question that raises the same problem.

Link to the groupby() documentation

  • Wow, that was excellent. Thank you very much!
