Deleting rows with repeated labels in a DataFrame

Question

Asked 4 years, 9 months ago

Viewed 50 times

1

I need to delete the rows of a DataFrame that contain repeated labels, as highlighted in column "B":

Below is how I would like the result to look after the removal:

Welcome to the platform. From now on, I suggest reading the following articles: How to ask a good question? and Manual on how NOT to ask questions. Both articles will teach you how to write a good question, avoiding downvotes and even close votes. Good luck! Take full advantage of our potential and always come back!

– Solkarped

2020/10/24 at 12:56

2 answers

2

Hello, you can use df.drop_duplicates() to filter the rows. I will assume pandas is already imported and the DataFrame already created, and I will use your example, like so:


df_sem_duplicacao = df.drop_duplicates(subset=['B'])
df_sem_duplicacao

The subset parameter takes a list of column labels. By default, df.drop_duplicates() compares all columns and removes only rows that are exactly equal. That's not what we want here, so I use subset to specify which column the filter should apply to.
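To make the difference between the default behaviour and subset concrete, here is a minimal sketch. The data is hypothetical (the table from the question is not reproduced here), so the column "A" and all the values are assumptions made only for illustration.

```python
import pandas as pd

# Hypothetical data: column "B" has repeated labels, column "A" does not.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": ["x", "y", "x", "z", "y"],
})

# Default: rows are compared on every column, so no row counts as a duplicate here.
df_todas_colunas = df.drop_duplicates()

# subset=['B']: rows are compared on column "B" only; the first occurrence is kept.
df_sem_duplicacao = df.drop_duplicates(subset=["B"])

print(len(df_todas_colunas))  # 5
print(df_sem_duplicacao)      # rows with index 0, 1 and 3 remain
```

Note that the original index labels are preserved; chaining .reset_index(drop=True) would renumber the rows if that is preferred.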

by Terry • **889** points · Answer 1 · 2020-10-24T13:19:39+00:00

Try using .drop_duplicates() this way:

df = df.drop_duplicates(subset=['B'])
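The question does not say which of the repeated rows should survive, so it may help to know that drop_duplicates also accepts a keep parameter. The sketch below uses hypothetical data (the values and the column "A" are assumptions, not taken from the question) to show the three options.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": ["x", "y", "x", "z", "y"],
})

# keep="first" is the default: the first row of each repeated label is kept.
first = df.drop_duplicates(subset=["B"])

# keep="last": the last row of each repeated label is kept instead.
last = df.drop_duplicates(subset=["B"], keep="last")

# keep=False: every row whose label is repeated is dropped entirely.
none_repeated = df.drop_duplicates(subset=["B"], keep=False)

print(first["A"].tolist())          # [1, 2, 4]
print(last["A"].tolist())           # [3, 4, 5]
print(none_repeated["A"].tolist())  # [4]
```

If the goal is to remove every row whose label appears more than once, keep=False is probably the variant to use; otherwise the default already matches the answers above.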