In Python, how do I search a text column for the string held in another column?

Viewed 263 times

2

I’m working with Python in a Jupyter Notebook.

I have a dataframe with name and text fields. The text field holds the contents of a loaded .txt file, and I want to check whether it contains a string that is exactly the value in the name field.

Dataframe:

[Image of the dataframe]

I would like a result for every row of the dataframe, indicating whether the name field's value was found in the text or not.

2 answers

3

If I understand correctly, you need all the rows where the contents of the column nome appear in the column texto.

You can try something like the line below, which uses the isin method, if you only need the value to be contained rather than exactly equal:

df.loc[df.nome.isin(df['texto'])]

If you want identical values, you can compare the two columns directly, record by record. The line below returns True or False for each record compared:

df['nome'] == df['texto']
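To see how these two lines behave when texto holds long passages, it may help to run them on toy data. A minimal sketch, assuming a made-up two-row dataframe (the sample values are hypothetical): isin tests whether each nome value equals some entire texto value, and == tests equality within the same row, so neither one looks inside the text.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
df = pd.DataFrame({
    "nome": ["GIRASSOL", "ACME"],
    "texto": ["Report on GIRASSOL for the year", "Nothing relevant here"],
})

# isin: True only if a nome value equals an entire texto value.
isin_mask = df["nome"].isin(df["texto"])

# ==: True only if nome and texto are identical in the same row.
equal_mask = df["nome"] == df["texto"]

print(isin_mask.tolist())
print(equal_mask.tolist())
```

On this sample both masks are False for every row, even though the first text mentions its name.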

I hope I’ve helped in some way.

  • 1

    Thanks Ruan, you understood correctly, but I did not get results with that code. I did get a result with the code below:
data[['nome' in x for x in data['texto']]]
 Your second snippet returned False for every row. I think it compares the full contents of the two fields, but the name field is always very short, while the text field holds whole pages of text, and I need to find exactly the contents of the name field somewhere in it. I'm still analyzing this to see if I can do better. Thanks.
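One thing to watch in the comprehension from this comment: 'nome' in x tests for the literal word nome in each text, not for that row's value of the nome column. A minimal sketch of a per-row version, assuming the columns are called nome and texto and using made-up sample rows:

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataframe.
data = pd.DataFrame({
    "nome": ["GIRASSOL", "ACME"],
    "texto": ["Report on GIRASSOL for the year", "Nothing relevant here"],
})

# Pair each name with the text of the same row and test substring containment.
mask = [nome in texto for nome, texto in zip(data["nome"], data["texto"])]

# Keep only the rows whose texto contains their own nome.
found = data[mask]

print(mask)
print(found["nome"].tolist())
```

This sketch assumes both columns contain only strings; a missing value in either column would raise a TypeError in the in test.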

1

CONTAINS

You could use the str.contains method, which is similar to LIKE in SQL.

df = df[df['nome'].str.contains('INVESTIMENTOS', na = False)]

In this case the result is that the DF keeps only the rows whose nome column contains INVESTIMENTOS.

MATCH

str.match is for when you want a precise string, matched from the start of the value, for example:

df = df[df['nome'].str.match('GIRASSOL FUNDO DE INVESTIMENTO EM ACOES', na = False)]

This would return only the rows that have "GIRASSOL FUNDO DE INVESTIMENTO EM ACOES" in the nome column, together with their corresponding values in the texto column.
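The difference between the two methods can be sketched on a few made-up values (every sample name other than the one used in this answer is hypothetical). By default both treat the pattern as a regular expression. str.contains finds it anywhere in the value, while str.match only requires it at the start of the value, so a longer value that begins with the pattern also matches.

```python
import pandas as pd

nomes = pd.Series([
    "GIRASSOL FUNDO DE INVESTIMENTO EM ACOES",
    "GIRASSOL FUNDO DE INVESTIMENTO EM ACOES II",  # hypothetical longer name
    "BANCO DE INVESTIMENTOS",                       # hypothetical
    None,                                           # missing value
])

# Substring search anywhere in the value; regex=False treats the pattern literally.
contains_mask = nomes.str.contains("INVESTIMENTOS", regex=False, na=False)

# Regex match anchored at the start of the value only.
match_mask = nomes.str.match("GIRASSOL FUNDO DE INVESTIMENTO EM ACOES", na=False)

print(contains_mask.tolist())
print(match_mask.tolist())
```

Note that the plural INVESTIMENTOS does not occur in the GIRASSOL names, which use the singular, so only the third value passes the contains test here.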

  • 1

    I understand, but instead of passing a fixed string, I want to use the value of the name column and check whether it can be found in the text column.

  • Parciliano, could you give me an example so I can see this in a more structured way?

  • Hi @Fr0st, my current dataframe has the following format: data.columns is ['name', 'filename', 'text']. All columns are strings. I want to take the 'name' column and check whether its entire contents appear exactly somewhere in the string of the 'text' column and, if possible, get a result for every row, whether found or not. Afterwards I want to export this result to CSV. Thanks.
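The check described in this comment could be sketched as follows. This rests on several assumptions and has not been tried on the real data: the sample rows and the output filename result.csv are made up, "exactly" is read as a case-sensitive substring match, and missing values are treated as not found.

```python
import pandas as pd

# Hypothetical rows using the column names from the comment.
data = pd.DataFrame({
    "name": ["GIRASSOL", "ACME", "DELTA"],
    "filename": ["a.txt", "b.txt", "c.txt"],
    "text": ["Report on GIRASSOL for the year", "Nothing relevant here", None],
})

def name_in_text(name, text):
    """Return True if name occurs verbatim (case-sensitive) inside text."""
    if not isinstance(name, str) or not isinstance(text, str):
        return False  # treat missing values as "not found"
    return name in text

# One True/False per row, kept alongside the original columns.
data["found"] = [name_in_text(n, t) for n, t in zip(data["name"], data["text"])]

# Export every row, found or not; the long text column is left out of the CSV.
csv_text = data[["name", "filename", "found"]].to_csv(index=False)
print(csv_text)

# To write a file instead, pass a path (hypothetical name):
# data[["name", "filename", "found"]].to_csv("result.csv", index=False)
```

Because the found column is added to the dataframe rather than used as a filter, rows without a match stay in the output with False.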
