How to calculate the average for groups and identify the maximum value?

Asked

Viewed 55 times

2

I have a spreadsheet where I need to find the month with the highest sales average.

With the code below I managed, but when I print the object appears all the disordered information as can be seen in the image.

https://prnt.sc/xka518

I need to print the month with the highest billed amount for all employees (example: "August 2020 - Total Billed: 15,000"

The code I’m using.

dfseason = df.groupby(by='Month')['Billed'].sum().nlargest(1)
print(dfseason)

When printing the object I have the information as in the image

https://prnt.sc/xka7uw

2 answers

4

If I understood what you need, it is possible to solve using the method agg:

tab =df.groupby(by='Month').agg({'Billed':'sum'})
#ordenando a tabela
tab=tab.sort_values('Billed',ascending=False)
tab

Returns:

       Billed
Month        
May       918
Nov       885
Apr       735
Dec       694
Jul       634
Jun       542
Sep       539
Aug       438

The key value is therefore obtained by tab.iloc[0,:].

Printing the formatted results:

for k, v in tab.to_dict()['Billed'].items():
    print("{} 2020 - Total Billed {}".format(k,v))

Returns:

May 2020 - Total Billed 918
Nov 2020 - Total Billed 885
Apr 2020 - Total Billed 735
Dec 2020 - Total Billed 694
Jul 2020 - Total Billed 634
Jun 2020 - Total Billed 542
Sep 2020 - Total Billed 539
Aug 2020 - Total Billed 438

To print only the largest:

print("{} 2020 - Total Billed {}".format(tab.index[0],tab.Billed[0]))
May 2020 - Total Billed 918
  • Using your method the output comes out exactly like mine.

  • Well, then I don’t understand the question. What’s wrong with this exit? How would you like it to stay?

  • I need to manipulate this object and give an example print. May 2020 - Total Billed : 918

  • Got it now. I’ll edit the answer

  • It is returning eight equal values (since it is eight months with values). It would be possible to return only one value?

2


One way to return the values in an "ordered" way is to use the reset_index

dfseason = df.groupby(by='Month', sort=True)['Billed'].sum().nlargest(1).reset_index()
dfseason

Exit

    Month   Billed
0   May     918

To customize a phrase

print(f"Mês {dfseason.loc[0,'Month']} - Total faturado {dfseason.loc[0,'Billed']}")

Exit

Mês May - Total faturado 918

Browser other questions tagged

You are not signed in. Login or sign up in order to post.