Replace specific strings in the dataframe with empty values


Viewed 188 times

-1

I have a .csv file that I am cleaning. The column ap_residencia_status should contain only floats, but it is filled with strings (even the numeric values are stored as strings), and the values that should be empty are filled with the string "N/D".

I would like to know how to filter out only the entries containing the N/D string from my column while preserving the numeric values that are stored as strings (I want to change all to float once N/D is removed from the column).


2 answers

1


is_not_nd = df.ap_residencia_estado != "N/D"
df = df[is_not_nd]

The first line returns a Boolean Series containing True for every value that is different from N/D.

The second line uses this mask to keep only the rows where it is True.
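As a rough illustration of what the mask does, here is a small sketch on made-up data. The sample values and the extra column `other` are assumptions for the example only; the column name is copied from the snippet above.

```python
import pandas as pd

# Hypothetical sample data, not the asker's real file.
df = pd.DataFrame({
    "ap_residencia_estado": ["1.5", "N/D", "3.0", "N/D"],
    "other": [10, 20, 30, 40],
})

# Element-wise comparison: True where the value is not the string "N/D".
is_not_nd = df["ap_residencia_estado"] != "N/D"

# Boolean indexing keeps only the rows where the mask is True.
filtered = df[is_not_nd]

print(is_not_nd.tolist())
print(len(df), len(filtered))
```

Note that the whole row is dropped, not just the cell, so the filtered DataFrame has fewer rows than the original.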

  • Friend, for some reason your code removes some data from my DataFrame: it went from 4027 rows by 9 columns to 3985 rows by 9 columns.

  • 1

    Yes. Wasn't that the intention, to remove the rows with N/D? You said in your question: "I want to change all to float once N/D is removed from the column." You cannot remove N/D from the column without removing the entire row from the DataFrame, friend. Otherwise you would have a column with fewer rows than the rest of the DataFrame, and pandas would raise an error.

  • Yes, it was. I'm sorry, I think I described my intention badly. I would like to remove the string that fills the cell, but have it appear as "NaN", as it does in the other column, so that only an empty value is left in the cell instead of the string N/D.

0

I will test your method, but I also managed to solve the problem with the following approach. :)

# 4) Convert the string values to float and replace the N/D values in the column with empty values (NaN)
df['ap_residencia_estadia'] = pd.to_numeric(df['ap_residencia_estadia'], errors='coerce')
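To see the effect of `errors='coerce'` in isolation, here is a minimal sketch on invented sample values (the data is an assumption; only the column name and the `pd.to_numeric` call come from the snippet above). No rows are dropped: the numeric strings become floats and "N/D" becomes NaN.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV column.
df = pd.DataFrame({"ap_residencia_estadia": ["1.5", "N/D", "3", "N/D"]})

# Values that cannot be parsed as numbers are turned into NaN instead of raising.
converted = pd.to_numeric(df["ap_residencia_estadia"], errors="coerce")

print(converted.dtype)         # dtype of the converted column
print(converted.isna().sum())  # number of values that became NaN
print(len(converted))          # row count is unchanged
```

One caveat: `errors='coerce'` turns every unparseable value into NaN, not only "N/D", so it may be worth comparing the NaN count with the number of "N/D" entries before overwriting the column.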
