In a DataFrame, how to take the values of a datetime column, classify them into periods (morning, afternoon and evening) and place the result in a new column


Hello!!! I am looking for help to solve the problem below.

My df has a 'DATA' column in the format 29/01/2019 17:50:11, which pandas reads as 'object' type by default. In total, it has 640 rows. I've tried many things; my best attempt so far was the following.

Create a range for each period and compare them with the column data.

periodo1 = pd.date_range('2020-10-15 05:00','2021-03-18 12:00',freq='1T')
periodo2 = pd.date_range('2020-10-15 12:01','2021-03-18 18:00',freq='1T')
periodo3 = pd.date_range('2020-10-15 18:01','2021-03-18 04:59',freq='1T')

for index, row in df.iterrows():
    if row['DATA1'] in periodo1:
        df.loc[index,'DATA1'] = 'Manhã'
    if row['DATA1'] in periodo2:
        df.loc[index,'DATA1'] = ‘Tarde'
    if row['DATA1'] in periodo3:
        df.loc[index,'DATA1'] = 'Noite'

The first part of the code, for periodo1, runs perfectly. However, it fills DATA1 (the new column) entirely with 'Manhã' (morning) and ignores the rest of the code.

Any help is welcome. Thank you in advance.

  • The single quote on the 'Tarde' (afternoon) line seems to be a different character. I don't know if it's like that in your code or only here on the site, but that may be the problem. If you copied and pasted the code here, it's worth taking a look.

1 answer



Follow the steps below

Creating a test DataFrame

>>> import pandas as pd
>>> df = pd.DataFrame({"DataString": ["29/01/2019 17:50:11", "29/01/2020 3:50:00", "29/01/2021 21:50:11"]})
>>> df
            DataString
0  29/01/2019 17:50:11
1   29/01/2020 3:50:00
2  29/01/2021 21:50:11

Converting the date string to datetime type

>>> df["Data"] = pd.to_datetime(df["DataString"], format="%d/%m/%Y %H:%M:%S")
>>> df
            DataString                Data
0  29/01/2019 17:50:11 2019-01-29 17:50:11
1   29/01/2020 3:50:00 2020-01-29 03:50:00
2  29/01/2021 21:50:11 2021-01-29 21:50:11
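With 640 rows, some strings may not match the expected format. As a side sketch (not part of the original answer, and the malformed sample value is made up for illustration), `errors="coerce"` turns unparseable values into `NaT` instead of raising, so they can be counted and inspected before classifying:

```python
import pandas as pd

# Hypothetical sample: the second value is deliberately malformed.
df = pd.DataFrame({"DataString": ["29/01/2019 17:50:11", "not a date"]})

# errors="coerce" yields NaT for strings that do not match the format,
# instead of raising an exception.
df["Data"] = pd.to_datetime(
    df["DataString"], format="%d/%m/%Y %H:%M:%S", errors="coerce"
)

print(df["Data"].dtype)         # a datetime64 dtype after conversion
print(df["Data"].isna().sum())  # number of rows that failed to parse
print(df["Data"].dt.hour)       # hour component (NaN where parsing failed)
```

Once the column has a datetime dtype, the `.dt` accessor exposes components such as the hour directly.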

Defining the function (to determine morning, afternoon and night, only the hour is needed, since it is in 24-hour format)

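def categoriza(x):
    # x.hour goes from 0 to 23; the ranges below do not overlap
    if x.hour in range(6, 13):     # 06:00 to 12:59
        return "manha"
    elif x.hour in range(13, 19):  # 13:00 to 18:59
        return "tarde"
    elif x.hour in range(19, 24):  # 19:00 to 23:59
        return "noite"
    elif x.hour in range(0, 6):    # 00:00 to 05:59
        return "madrugada"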
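If the limits need to fall on specific minutes, as in the question (morning from 05:00, afternoon from 12:01, night from 18:01), comparing `x.time()` is one option. This is only a sketch under that reading of the question's periods; the function name `categoriza_por_horario` is hypothetical and the exact limits are an assumption.

```python
from datetime import time

import pandas as pd


def categoriza_por_horario(x):
    """Hypothetical variant: classify by time of day with minute-level limits."""
    t = x.time()  # drops the date, keeps only the time of day
    if time(5, 0) <= t < time(12, 1):
        return "Manhã"
    elif time(12, 1) <= t < time(18, 1):
        return "Tarde"
    else:
        # 18:01 onward, and everything before 05:00
        return "Noite"


print(categoriza_por_horario(pd.Timestamp("2019-01-29 17:50:11")))
print(categoriza_por_horario(pd.Timestamp("2020-01-29 03:50:00")))
```

It can be applied to the column the same way as any other function, with `.apply`.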

Applying the function

>>> df['categoria'] = df["Data"].apply(categoriza)

>>> df

            DataString                Data  categoria
0  29/01/2019 17:50:11 2019-01-29 17:50:11      tarde
1   29/01/2020 3:50:00 2020-01-29 03:50:00  madrugada
2  29/01/2021 21:50:11 2021-01-29 21:50:11      noite
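For larger frames, a vectorized alternative to `.apply` is `pd.cut` on the hour. This swaps in a different technique from the one in the answer; the bins below are my attempt to reproduce the hour limits the function effectively applies (0 to 5 madrugada, 6 to 12 manha, 13 to 18 tarde, 19 to 23 noite), so treat it as a sketch:

```python
import pandas as pd

df = pd.DataFrame(
    {"DataString": ["29/01/2019 17:50:11", "29/01/2020 03:50:00", "29/01/2021 21:50:11"]}
)
df["Data"] = pd.to_datetime(df["DataString"], format="%d/%m/%Y %H:%M:%S")

# right=False makes the bins left-closed: [0, 6), [6, 13), [13, 19), [19, 24)
df["categoria"] = pd.cut(
    df["Data"].dt.hour,
    bins=[0, 6, 13, 19, 24],
    right=False,
    labels=["madrugada", "manha", "tarde", "noite"],
)

print(df)
```

The resulting column has a categorical dtype; `.astype(str)` converts it to plain strings if that is preferred.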

Note: you can overwrite the existing columns instead of creating new ones. I only did it this way to make the steps easier to follow.

  • Paulo!! It fit like a glove. It ran perfectly. Cheers.

  • Hey! Be sure to mark the answer as accepted...
