In the example below I created a small DataFrame
containing two columns and four rows.
The date of 3/10/2020
repeats twice and it is she who will be filtered.
Creating the Dataframe
>>> import pandas as pd
>>> df = pd.DataFrame({"DT_REFER": ["1/10/2020", "2/10/2020", "3/10/2020", "3/10/2020"],
"OUTRA": ["banana", "laranja", "abacaxi", "goiaba"]})
>>> df
DT_REFER OUTRA
0 1/10/2020 banana
1 2/10/2020 laranja
2 3/10/2020 abacaxi
3 3/10/2020 goiaba
Converting the column DT_REFER
of string
for datetime
>>> df["DT_REFER"] = pd.to_datetime(df['DT_REFER'], format="%d/%m/%Y")
>>> df
DT_REFER OUTRA
0 2020-10-01 banana
1 2020-10-02 laranja
2 2020-10-03 abacaxi
3 2020-10-03 goiaba
Filtering by date and playing the result for a new Dataframe
>>> new_df = df[(df["DT_REFER"] == "2020-10-03")]
>>> new_df
DT_REFER OUTRA
2 2020-10-03 abacaxi
3 2020-10-03 goiaba
UPDATE: In time, to find the latest date, just take the maximum amount (max
) column DT_REFER
>>> recent_date = df['DT_REFER'].max()
>>> recent_date
Timestamp('2020-10-03 00:00:00')
UPDATE
Based on your comments, you can:
>>> df
DT_REFER OUTRA VERSAO
0 2020-10-01 banana 2020-10-05
1 2020-10-02 laranja 2020-10-06
2 2020-10-03 abacaxi 2020-10-10
3 2020-10-03 goiaba 2020-10-03
>>> new_df = df[(df["DT_REFER"] == df["DT_REFER"].max()) & (df["VERSAO"] == df["VERSAO"].max())]
>>> new_df
DT_REFER OUTRA VERSAO
2 2020-10-03 abacaxi 2020-10-10
Note that the operator &
represents AND
and the operator |
(not used in this query, represents OR
Here has a reference if you want.
I hope it helps
What filter expression are you using? The column
DT_REFER
is it already date or string? If string, convert usingpd.to_datetime
.– Paulo Marques
were not in date format, I ended up converting, thank you
– Saulo