How to filter data in a data.frame using a certain amount of time in R?

Asked

Viewed 826 times

1

Suppose I have one data.frame four-column:

print(Dados)

CLIENTE QTDE VALOR$ DATA_COMPRA
1234    2    50     2019-02-04
4586    1    70     2019-01-17
6535    3    25     2018-12-28
9562    1    150    2018-12-25
3478    7    12     2018-10-12
2684    4    33     2018-06-03
1593    25   2      2017-12-31

My goal is to filter out the customers who’ve been shopping for the last three months. Customers who are between the month 2019 2 and the month 2018 12:

print(Dados_Filtrados)

CLIENTE QTDE VALOR$ DATA_COMPRA
1234    2    50     2019-02-04
4586    1    70     2019-01-17
6535    3    25     2018-12-28
9562    1    150    2018-12-25

But this data.frame has the following condition:

This data.frame is updated monthly. That is, I can’t filter by the last three names of the month or by a specific date, because every update the month will change.

I believe that the solution would be to filter the data.frame using a certain period of time. But how can I do this?

  • 1

    Why the month 2 2019? Is this chosen by the user? We are already in the month 4, so it should not be through Sys.Date().

  • @Rui Barradas I specified month 2 for being in agreement with the example of the data I put, this table is not real, the month 2 from 2019 to month 12 of 2018 was just an example period. But if you want I can change the data to facilitate understanding.

1 answer

3


The following function filters the data by date, passed as argument mes. It is assumed that the date format is year-month-day, and therefore a full date must be passed, but it can be any other format, as long as it is also passed in the argument formato.

library(lubridate)

filtrarMes <- function(DF, mes, formato = "%Y-%m-%d"){
  if(!grepl("%d", formato)){
    mes <- paste(mes, "01", sep = "-")
    formato <- paste(formato, "%d", sep = "-")
  }
  mes <- as.Date(mes, format = formato)
  prim <- mes - months(3)
  i <- which(DF[["DATA_COMPRA"]] >= prim)
  DF[i, ]
}

filtrarMes(Dados, mes = "2019-2", formato = "%Y-%m")
#  CLIENTE QTDE VALOR. DATA_COMPRA
#1    1234    2     50  2019-02-04
#2    4586    1     70  2019-01-17
#3    6535    3     25  2018-12-28
#4    9562    1    150  2018-12-25

filtrarMes(Dados, mes = Sys.Date())
#  CLIENTE QTDE VALOR. DATA_COMPRA
#1    1234    2     50  2019-02-04
#2    4586    1     70  2019-01-17

Dice.

Dados <- read.table(text = "
CLIENTE QTDE VALOR$ DATA_COMPRA
1234    2    50     2019-02-04
4586    1    70     2019-01-17
6535    3    25     2018-12-28
9562    1    150    2018-12-25
3478    7    12     2018-10-12
2684    4    33     2018-06-03
1593    25   2      2017-12-31
", header = TRUE)

Dados$DATA_COMPRA <- as.Date(Dados$DATA_COMPRA)
  • Thanks for the reply @Ruibarradas! I will test.

  • Worked perfectly!

Browser other questions tagged

You are not signed in. Login or sign up in order to post.