Two things in the example you posted:
- The months are abbreviated in capital letters
- The months are in Portuguese
For the first case, calling str.title() on the column solves it.
For the second case, you have to replace the Portuguese month abbreviations with their English equivalents before parsing.
>>> df = pd.DataFrame({"data": ["01JUN2020:00:00:01", "01AUG2020:00:00:04"]})
>>> df
data
0 01JUN2020:00:00:01
1 01AUG2020:00:00:04
>>> df['nova_data'] = pd.to_datetime(df['data'].str.title(),format='%d%b%Y:%H:%M:%S')
>>> df
data nova_data
0 01JUN2020:00:00:01 2020-06-01 00:00:01
1 01AUG2020:00:00:04 2020-08-01 00:00:04
The df.info() output shows that nova_data is now datetime64[ns]:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 data 2 non-null object
1 nova_data 2 non-null datetime64[ns] <<<<<------
dtypes: datetime64[ns](1), object(1)
memory usage: 160.0+ bytes
For the second case, replace the Portuguese abbreviation with the English one before parsing:
>>> df = pd.DataFrame({"data": ["01JUN2020:00:00:01", "01MAI2020:00:00:04"]})
>>> df['nova_data'] = pd.to_datetime(df['data'].str.replace('MAI', 'May'),format='%d%b%Y:%H:%M:%S')
The resulting DataFrame:
>>> print(df)
data nova_data
0 01JUN2020:00:00:01 2020-06-01 00:00:01
1 01MAI2020:00:00:04 2020-05-01 00:00:04
Update based on the comment: chain one str.replace call for each month that appears in your data.
>>> df = pd.DataFrame({"data": ["01FEV2020:00:00:01", "01MAI2020:00:00:04"]})
>>> df
data
0 01FEV2020:00:00:01
1 01MAI2020:00:00:04
>>> df['nova_data'] = pd.to_datetime(df['data'].str.replace('MAI', 'May').str.replace('FEV', 'Feb'),format='%d%b%Y:%H:%M:%S')
>>> df
data nova_data
0 01FEV2020:00:00:01 2020-02-01 00:00:01
1 01MAI2020:00:00:04 2020-05-01 00:00:04
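If you need all twelve months, chaining str.replace calls gets long. Here is a minimal sketch of a dictionary-based version. It assumes the usual three-letter Portuguese abbreviations in upper case and an English (or C) locale, so that %b matches English month names. The names PT_TO_EN, pattern and translated are illustrative and not part of pandas.

```python
import pandas as pd

# Assumed Portuguese month abbreviations (upper case, as in the data),
# mapped to the English abbreviations that %b expects under an English/C locale.
PT_TO_EN = {
    "JAN": "Jan", "FEV": "Feb", "MAR": "Mar", "ABR": "Apr",
    "MAI": "May", "JUN": "Jun", "JUL": "Jul", "AGO": "Aug",
    "SET": "Sep", "OUT": "Oct", "NOV": "Nov", "DEZ": "Dec",
}

# One regex alternation that matches any of the Portuguese abbreviations.
pattern = "|".join(PT_TO_EN)

df = pd.DataFrame(
    {"data": ["01FEV2020:00:00:01", "01MAI2020:00:00:04", "15DEZ2020:12:30:00"]}
)

# With regex=True, str.replace accepts a callable that receives the match
# object, so each abbreviation is looked up in the dictionary in a single pass.
translated = df["data"].str.replace(
    pattern, lambda m: PT_TO_EN[m.group(0)], regex=True
)
df["nova_data"] = pd.to_datetime(translated, format="%d%b%Y:%H:%M:%S")
```

Because the rest of each string contains only digits and colons, the alternation can only match the month part. If your data has other text around the date, anchor the pattern more tightly.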
Is it possible to apply str.replace to all months (every month in my database is in Portuguese)? I tried using a dictionary, but it did not work.
– fjsfm
Post updated based on your comment. I only included two months, but you can do the same for whichever ones you need.
– Paulo Marques