How do I filter a DataFrame row by a known string value in one of its columns?

1

Let's say I have a DataFrame with a column called "Code".

I know that one row of this column contains the string "mxrf11", and that no other row of the DataFrame has "mxrf11" in the "Code" column. (I mention this because I don't know whether repeated values matter, but in this DataFrame no value in the "Code" column is repeated.)

How do I retrieve the row of the DataFrame that corresponds to the code "mxrf11"?

Edit 1: Trying the solution from the first comment

Something strange happened: it returned the column names of the DataFrame instead of the information associated with the row.

Interestingly, the same problem happened with the second solution suggested in the post.

Edit 2: The problem was that I was searching with lowercase letters while the values in the column are uppercase, so the filter returned no rows. After replacing the search string with the uppercase version, all the methods returned the expected information. :)

2 answers

2


Take this example:

Creating an example DataFrame

>>> import pandas as pd

>>> df = pd.DataFrame({"frutas": ["banana", "laranja", "pera", "uva", "banana", "pera"], "codigo": ["um", "dois", "mxrf11", "quatro", "cinco", "seis"]})

>>> df
    frutas  codigo
0   banana      um
1  laranja    dois
2     pera  mxrf11
3      uva  quatro
4   banana   cinco
5     pera    seis

Filtering by the code

>>> df[df["codigo"] == "mxrf11"]

  frutas  codigo
2   pera  mxrf11

Assigning result to another Dataframe

>>> df_filtrado = df[df["codigo"] == "mxrf11"]

>>> print(df_filtrado)

  frutas  codigo
2   pera  mxrf11

Filtering by PART of the code

>>> df[df['codigo'].str.contains("mxr")]

  frutas  codigo
2   pera  mxrf11
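
Note that `str.contains` is case-sensitive by default and treats the pattern as a regular expression. As a minimal sketch, assuming the codes are stored in uppercase (the sample data below is hypothetical, not the asker's real DataFrame), a partial match that ignores case could look like this:

```python
import pandas as pd

# Hypothetical data: codes stored in uppercase, plus one missing value.
df = pd.DataFrame({
    "frutas": ["banana", "laranja", "pera", "uva"],
    "codigo": ["UM", "DOIS", "MXRF11", None],
})

# case=False ignores letter case.
# regex=False treats "mxr" as a literal string, not a regular expression.
# na=False counts missing values as "no match" so the mask stays boolean.
mask = df["codigo"].str.contains("mxr", case=False, regex=False, na=False)
resultado = df[mask]

print(resultado)
```

Passing `na=False` matters only when the column can hold missing values; without it, the mask may contain missing entries and the boolean indexing can fail.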

I hope this helps.

  • Friend, something strange happened: it returned the column names and not the information contained in them. I edited my question to show the result of the solution you mentioned.

  • 1

    @Kioolz, just below the line df.columns = title, use df[df['codigo'].str.contains("ALMI11")]. This should return the row with index 3. It worked for me.

  • It worked, it was the lowercase letters!

1

Creating a test DataFrame

import pandas as pd

codigos = ['cod1','mxrf11','cod2','mxrf11','cod3','mxrf11']
valores = ['teste1','teste2','teste3','teste4','teste5','teste6']

df = pd.DataFrame({'Codigos':codigos, 'Valores':valores})

Using isin

busca = ['mxrf11']
df[df['Codigos'].isin(busca)]

Output

   Codigos  Valores
1   mxrf11  teste2
3   mxrf11  teste4
5   mxrf11  teste6
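
Since `isin` accepts a list, it is handy when searching for several codes at once. It compares exact values, so it is case-sensitive. The sketch below assumes the column stores uppercase codes and normalizes both sides before comparing; the data and the names `busca_normalizada` and `encontrados` are made up for illustration:

```python
import pandas as pd

# Hypothetical data with uppercase codes.
df = pd.DataFrame({
    "Codigos": ["COD1", "MXRF11", "COD2", "ALMI11"],
    "Valores": ["teste1", "teste2", "teste3", "teste4"],
})

# Search terms as a user might type them, in any letter case.
busca = ["mxrf11", "almi11"]

# Normalize both the search terms and the column to uppercase before comparing.
busca_normalizada = [codigo.upper() for codigo in busca]
encontrados = df[df["Codigos"].str.upper().isin(busca_normalizada)]

print(encontrados)
```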

Using loc

df[df.loc[:,'Codigos'] == 'mxrf11']

Output

   Codigos  Valores
1   mxrf11  teste2
3   mxrf11  teste4
5   mxrf11  teste6
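
When a filter matches nothing, pandas returns an empty DataFrame, which is displayed as just the column names. That is one likely explanation for seeing only headers when the letter case of the search string does not match. The sketch below uses hypothetical data with a unique uppercase code and shows one way to check for an empty result and pull the single matching row as a Series:

```python
import pandas as pd

# Hypothetical data: each code appears only once.
df = pd.DataFrame({
    "Codigos": ["COD1", "MXRF11", "COD2"],
    "Valores": ["teste1", "teste2", "teste3"],
})

# A lowercase search matches nothing: the result is an empty DataFrame
# that still carries the column names.
vazio = df.loc[df["Codigos"] == "mxrf11"]
print(vazio.empty)

# With the matching case, the filter returns the row.
linhas = df.loc[df["Codigos"] == "MXRF11"]

# Guard against an empty result before taking the first (and only) row.
linha = None if linhas.empty else linhas.iloc[0]
print(linha)
```

Here `linha` is a Series indexed by column name, so `linha["Valores"]` gives the value for that code.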
  • The same thing happened as in my first edit. I'll attach another image and upload the code to GitHub to give a better view of what's going on.

  • 1

    @Kioolz, good evening! The code column in your example holds uppercase values, so you need to search with the matching uppercase string, in this case MXRF11. That way the rows will be returned. Cheers!

  • 1

    That was exactly it: all the methods in this post are case-sensitive, so they distinguish lowercase from uppercase. I tested every method you showed and they all worked perfectly. Thank you very much for your help :)
