Removing non-numeric values from a DataFrame

[Image: the DataFrame]

My intention is to remove the values that appear as '...' (as shown above) and replace them with an empty field.

This is the code I'm using to try to remove them:

df['Energy Supply'].str.replace('[.]*', '')

However, it returns an output where all values become NaN:

[Image: the output]

How could I fix this problem?

1 answer

One option is to use the regex option of replace: df.replace(r'\.+', np.nan, regex=True)

A reproducible example follows:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
                   'B': ['......', 6, '.......', 8, 9],
                   'C': ['........', 'b', 'c', '....', 'e']})

# Raw string, so that '\.' is not read as a Python escape sequence
print(df.replace(r'\.+', np.nan, regex=True))

Output:

   A    B    C
0  0  NaN  NaN
1  1  6.0    b
2  2  NaN    c
3  3  8.0  NaN
4  4  9.0    e
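
As a possible refinement, here is a minimal sketch of two variations on the same idea. The sample data below is hypothetical (only the column name 'Energy Supply' comes from the question), and the sketch assumes the placeholder cells consist solely of dots. The first variation anchors the pattern so that strings which merely contain a dot, such as '1.5', are left alone; the second uses pd.to_numeric with errors='coerce' on a single column, which turns anything that cannot be parsed as a number into NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({'Country': ['A', 'B', 'C'],
                   'Energy Supply': ['...', 321, '1.5']})

# Variation 1: anchored pattern, so only cells made up entirely of dots
# are replaced. Strings that merely contain a dot (e.g. '1.5') are kept.
anchored = df.replace(r'^\.+$', np.nan, regex=True)

# Variation 2: coerce one column to numbers; values that cannot be parsed
# as numbers (such as '...') become NaN.
numeric = pd.to_numeric(df['Energy Supply'], errors='coerce')

print(anchored)
print(numeric)
```

The second variation may suit a column that should end up fully numeric, since it also converts numeric strings to floats rather than leaving them as text.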
  • It worked .... thank you!

  • If this answer solved your problem and you have no remaining doubts, mark it as accepted by clicking the check mark below the answer's vote counter; this also marks your question as solved.
