Filter Dataset from a groupby

I have the following dataframe (df_reviews) and need to remove the app versions that were downloaded fewer than 10 times. I tried the steps below, followed by a for loop to apply the filter to the original dataset, but without success. My step-by-step approach is below.

[Image: the df_reviews dataframe]

I grouped by "App Version Name" to see how many times each version was downloaded:

App_version = df_reviews.groupby('App Version Name').agg({'Star Rating': ['mean', 'min', 'max', 'count']})

And the result was this:

[Image: output of the aggregation]

After that I created a condition to clean up df_reviews, so that any version with fewer than 10 downloads is dropped from the dataframe:

condicao1 = df_reviews['App Version Name'].value_counts() < 10

[Image: output of condicao1]
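For readers without the screenshot, here is a minimal sketch of what this condition looks like. The data is made up (the version names "1.0" and "1.1" and the ratings are hypothetical); only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_reviews = pd.DataFrame({
    "App Version Name": ["1.0"] * 12 + ["1.1"] * 3,
    "Star Rating": [5, 4, 3] * 5,
})

# value_counts() returns one entry per version, indexed by the version name,
# so the comparison yields a boolean Series keyed by version, not by row.
condicao1 = df_reviews["App Version Name"].value_counts() < 10
print(condicao1)
```

Because the result is indexed by version name rather than by row position, it cannot be compared directly against a single cell of df_reviews.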

To filter df_reviews, I wrote a for loop to remove the rows whose version has fewer than 10 downloads, but the loop is not working.

for i in range(0, df_reviews.shape[0]):  # going through row by row
    if df_reviews.at[i, "App Version Name"] == condicao1:  # .at is used to access by label
        df_reviews['Teste1'] = df_reviews['Teste1'].append(condicao1.loc[[i]])
  • Try df_reviews[df_reviews['App Version Name'].isin(condicao1[condicao1 == False].index)]. This keeps the rows whose version name is among those marked False in condicao1.

  • It worked! Thank you very much (and it is much more practical).

1 answer

Try

df_reviews[df_reviews['App Version Name'].isin(condicao1[condicao1 == False].index)]

This keeps the rows whose version name is among those marked False in condicao1, that is, the versions that appear at least 10 times.
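As a self-contained sketch of this filter, the example below uses made-up data (the version names and ratings are hypothetical; only the column names and condicao1 come from the question). It writes `~condicao1`, which is equivalent to `condicao1 == False`, and also shows one possible alternative based on `groupby(...).transform("size")` that skips the intermediate condition.

```python
import pandas as pd

# Hypothetical sample data: "1.0" has 12 rows, "1.1" has 3, "2.0" has 10.
df_reviews = pd.DataFrame({
    "App Version Name": ["1.0"] * 12 + ["1.1"] * 3 + ["2.0"] * 10,
    "Star Rating": [5, 4, 3, 2, 1] * 5,
})

# True for versions with fewer than 10 rows, indexed by version name.
condicao1 = df_reviews["App Version Name"].value_counts() < 10

# Version names to keep: those where the condition is False.
keep_versions = condicao1[~condicao1].index
filtered = df_reviews[df_reviews["App Version Name"].isin(keep_versions)]

# Alternative sketch: attach each version's row count to every row,
# then keep the rows whose version has at least 10 rows.
rows_per_version = df_reviews.groupby("App Version Name")["Star Rating"].transform("size")
filtered_alt = df_reviews[rows_per_version >= 10]

print(filtered["App Version Name"].value_counts())
```

Both approaches avoid the row-by-row loop: the boolean mask is computed for the whole column at once and the original row order is preserved.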

by Jorge Mendes: /users/150167/jorge-mendes
