How do I select the top 10 values of a DataFrame column in Python?

Question

Asked 4 years, 8 months ago

Viewed 874 times

I need to find the 10 largest values of the sales-quantity variable in a sales data set, together with the corresponding customer names.

In other words, I want the names of the 10 customers who bought the most.

With df['qtd_ordens'].max() I get the maximum sales value, but I need the top 10 along with the names of those customers.

1 answer

by AlexCiuffa • **2,402** points · Answer 1 · 2020-11-02T05:28:10+00:00

pandas provides the method DataFrame.nlargest(), which takes the following parameters:

n (int): Number of rows to return
columns (list or str): Column(s) to order by
keep ('first', 'last' or 'all'): Decides what to do with rows that have tied values. The default is 'first', which keeps the first occurrences; 'all' keeps every tied row, so the result may have more than n rows.

It returns a data frame with the first n rows of your data frame df, sorted in descending order by the column(s) given in columns. In your case:

df.nlargest(10, 'qtd_ordens')
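To make the effect of keep concrete, here is a minimal sketch on a small made-up data frame. Only the column names come from the question; the customer names and values are hypothetical, and n is reduced so the tie is easy to see.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "nome_cliente": ["Ana", "Bruno", "Carla", "Diego", "Elisa"],
    "qtd_ordens": [7, 12, 9, 9, 3],
})

# keep='first' (the default) returns exactly n rows;
# tied values are taken in their original order of appearance.
top3 = df.nlargest(3, "qtd_ordens")

# keep='all' also returns every row tied with the last selected value,
# so the result can contain more than n rows.
top2_with_ties = df.nlargest(2, "qtd_ordens", keep="all")

print(top3)
print(top2_with_ties)
```

If it matters that no tied customer is silently dropped from the top 10, keep='all' is probably the safer choice.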

The same result can be obtained by combining the methods .sort_values(), which sorts the data frame, and .head(), which selects the first rows:

df.sort_values(by='qtd_ordens', ascending=False).head(10)
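As a quick check of that equivalence, the sketch below compares both approaches on made-up data (the names and values are hypothetical). It assumes there are no tied values; with ties, the two approaches may pick or order the tied rows differently.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "nome_cliente": ["Ana", "Bruno", "Carla", "Diego"],
    "qtd_ordens": [5, 20, 8, 14],
})

by_nlargest = df.nlargest(2, "qtd_ordens")
by_sort = df.sort_values(by="qtd_ordens", ascending=False).head(2)

# Compare the two results row by row (values and index labels).
print(by_nlargest.equals(by_sort))
```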

Then, to select only the columns 'nome_cliente' and 'qtd_ordens':

df.nlargest(10, 'qtd_ordens')[['nome_cliente','qtd_ordens']]