Select the top 10 values of a dataframe variable in python?

I need to find the top 10 values of the sales-quantity variable in a sales dataset, together with the corresponding client names.

In other words, I want the names of the 10 customers who bought the most.

With df['qtd_ordens'].max() I get the maximum sales value, but I need the top 10 along with the names of those customers.

  • 1

    Welcome to Stack Overflow! Please explain the problem in more detail and, if possible, include an example of code that reproduces what is happening, because your question is not clear. See How to Ask in the Help Center.

  • 1

    What have you tried so far? Post the code.

  • 1

    Also try to clarify the tools you’re using. It’s not just Python, it’s a DataFrame in pandas. Whenever possible, include examples of the input and the expected output so that others can understand exactly what you intend to do, what you have tried so far, and what the problem is.

1 answer

1


Pandas has the method .nlargest(). It takes the following parameters:

  • n (int): Number of rows to return
  • columns (list or str): Column(s) used in sorting
  • keep (‘first’, ‘last’ or ‘all’): Decides which rows to keep when several rows tie on the value. The default is ‘first’, which keeps the rows that appear first.

It returns a data frame with the first n rows of your data frame df, ordered by the given column(s) in descending order. In your case:

df.nlargest(10, 'qtd_ordens')
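
As a minimal sketch of how n and keep behave, assuming a small made-up data frame (the customer names and values are hypothetical; only the column names come from the question):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "nome_cliente": ["Ana", "Bruno", "Carla", "Diego", "Elisa"],
    "qtd_ordens": [12, 30, 30, 7, 18],
})

# Top 3 rows by qtd_ordens, using the default keep='first'.
top3 = df.nlargest(3, "qtd_ordens")

# keep='all' keeps every row tied with the last selected value,
# so the result can contain more than n rows.
top1_all = df.nlargest(1, "qtd_ordens", keep="all")

print(top3)
print(top1_all)
```

With ties at the cutoff, keep='all' is the option to reach for if dropping a tied customer would be misleading.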

The same result can be obtained by combining the methods .sort_values(), which sorts the data frame, and .head(), which selects the first rows:

df.sort_values(by='qtd_ordens', ascending=False).head(10)
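
To check that the two approaches agree, here is a small sketch on made-up data without ties (the rows are hypothetical; with tied values the two approaches may order or select the tied rows differently):

```python
import pandas as pd

# Hypothetical sample data with no tied values.
df = pd.DataFrame({
    "nome_cliente": ["Ana", "Bruno", "Carla", "Diego"],
    "qtd_ordens": [12, 30, 25, 7],
})

# Sort descending and take the first 2 rows.
by_sort = df.sort_values(by="qtd_ordens", ascending=False).head(2)

# Ask pandas directly for the 2 largest rows.
by_nlargest = df.nlargest(2, "qtd_ordens")

# True when both results hold the same rows in the same order.
same = by_sort.equals(by_nlargest)
print(same)
```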

Then, to select only the columns 'nome_cliente' and 'qtd_ordens':

df.nlargest(10, 'qtd_ordens')[['nome_cliente','qtd_ordens']]
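
The question does not say whether each customer appears on a single row. If the same customer can appear on several rows (an assumption, not something stated in the question), one possible sketch is to sum the quantities per customer before taking the largest values. The data below is hypothetical:

```python
import pandas as pd

# Hypothetical data in which a customer can appear on more than one row.
df = pd.DataFrame({
    "nome_cliente": ["Ana", "Bruno", "Ana", "Carla", "Bruno"],
    "qtd_ordens": [5, 8, 10, 7, 1],
})

# Sum the quantities per customer, keeping nome_cliente as a regular column.
totals = df.groupby("nome_cliente", as_index=False)["qtd_ordens"].sum()

# Take the 2 customers with the largest totals (use 10 on real data).
top2 = totals.nlargest(2, "qtd_ordens")[["nome_cliente", "qtd_ordens"]]
print(top2)
```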
