Value Replace with Python and Pandas

0

Hello, I'm trying to remove a character from data that I extracted from a TXT file. I loaded the file with pandas, but I can't remove the character using the replace command. Below are the values I have in one of the columns.

0 [23/Mar/2021:00:00:00
1 [23/Mar/2021:00:00:00
2 [23/Mar/2021:00:00:00
3 [23/Mar/2021:00:00:00
4 [23/Mar/2021:00:00:00

I am using the following command to remove "[":

y[3].replace("[","")
0 [23/Mar/2021:00:00:00
1 [23/Mar/2021:00:00:00
2 [23/Mar/2021:00:00:00
3 [23/Mar/2021:00:00:00
4 [23/Mar/2021:00:00:00
Name: 3, dtype: object

From what I understand, the character is not being removed because the column's dtype is object. I need to do this for other columns of my file as well, but I keep getting stuck on this conversion. I already tried the command below to change the column to string, without success:

y[3] = y[3].astype(str)
y[3]
Name: 3, dtype: object
  • Can you provide the data or a minimal working example (MWE)?

2 answers

1

Having the dataframe below:

>>> df
                     txt
0  [23/Mar/2021:00:00:00
1  [23/Mar/2021:00:00:00
2  [23/Mar/2021:00:00:00
3  [23/Mar/2021:00:00:00

Just do

df['txt'] = df['txt'].str.replace('[', '', regex=False)

print(df['txt'])
0    23/Mar/2021:00:00:00
1    23/Mar/2021:00:00:00
2    23/Mar/2021:00:00:00
3    23/Mar/2021:00:00:00
Name: txt, dtype: object
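
As a side note on why the original attempt left the values unchanged, here is a minimal sketch. The series `s` and the variable names are made up for illustration. By default, Series.replace compares the whole cell value, while Series.str.replace works on substrings inside each cell. The object dtype is simply how pandas commonly stores strings, so it is probably not the cause of the problem.

```python
import pandas as pd

# Hypothetical sample data mirroring the column in the question.
s = pd.Series(["[23/Mar/2021:00:00:00", "[23/Mar/2021:00:00:00"])

# Series.replace matches whole cell values by default; no cell is exactly "[",
# so nothing changes.
whole_value = s.replace("[", "")

# Series.str.replace operates on substrings inside each cell.
# regex=False treats "[" as a literal character instead of regex syntax.
substring = s.str.replace("[", "", regex=False)

print(whole_value.tolist())
print(substring.tolist())
```

Note also that both calls return a new Series, so the result has to be assigned back to the column to be kept.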

I believe you also want to convert the field to datetime, so do

df['data'] = pd.to_datetime(df['txt'], format='%d/%b/%Y:%H:%M:%S')

print(df)

                    txt       data
0  23/Mar/2021:00:00:00 2021-03-23
1  23/Mar/2021:00:00:00 2021-03-23
2  23/Mar/2021:00:00:00 2021-03-23
3  23/Mar/2021:00:00:00 2021-03-23

If you only need the dates, you can do it in one step by including the "[" in the format:

df['data'] = pd.to_datetime(df['txt'], format='[%d/%b/%Y:%H:%M:%S')

print(df)
                     txt       data
0  [23/Mar/2021:00:00:00 2021-03-23
1  [23/Mar/2021:00:00:00 2021-03-23
2  [23/Mar/2021:00:00:00 2021-03-23
3  [23/Mar/2021:00:00:00 2021-03-23

Then drop the txt column:

df.drop("txt", axis=1, inplace=True)
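
Putting the one-step version together, a self-contained sketch could look like the following. The sample rows are invented (the second one has a non-midnight time just to show that the time is parsed too), and it assumes an English locale so that %b matches "Mar".

```python
import pandas as pd

# Hypothetical sample rows in the same layout as the question.
df = pd.DataFrame({"txt": ["[23/Mar/2021:00:00:00", "[23/Mar/2021:13:45:10"]})

# The leading "[" in the format string is matched literally,
# so no separate replace step is needed.
df["data"] = pd.to_datetime(df["txt"], format="[%d/%b/%Y:%H:%M:%S")

# Non-in-place alternative to drop(..., inplace=True).
df = df.drop(columns="txt")

print(df)
```

Reassigning the result of drop is equivalent here to using inplace=True; which one to use is mostly a matter of style.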

-5


Use it like this:

text = pd.read_csv('dados.txt', header=None)
text = text.replace(r'\[', '', regex=True)
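
For anyone who wants to try this without the file, here is a small sketch that uses an in-memory buffer in place of 'dados.txt'. The two-column content is invented for illustration and is not the asker's real data.

```python
import io

import pandas as pd

# Stand-in for 'dados.txt'; this layout is an assumption, not the real file.
sample = io.StringIO(
    "[23/Mar/2021:00:00:00,GET\n"
    "[23/Mar/2021:00:00:01,POST\n"
)
text = pd.read_csv(sample, header=None)

# With regex=True, DataFrame.replace matches substrings in every string column.
# The raw string keeps "\[" from being read as a Python escape sequence.
text = text.replace(r"\[", "", regex=True)

print(text)
```

Because this applies to the whole DataFrame, it covers the other columns mentioned in the question in a single call.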
  • 1

    I followed this one too and it worked ok, thanks so much for the help.
