Date sequence from a range in R

Asked

Viewed 95 times

2

Good morning, I have a base with ID, start date and end date. I wanted to generate a basis that each row be one day within the range with an ID column. Groundwork:

 base <- data.frame( ID = c("A","B","C"),
                    data_inicio = c("2010-01-05", "2010-01-07", "2020-11-30"),
                    data_fim = c("2010-01-07", "2010-01-09", "2020-12-03"))

The exit would be:

DATE ID
A 2010-01-05
A 2020-01-06
A 2020-01-07
B 2020-01-07
B 2020-01-08

etc....

3 answers

4


Using dplyr and lubridate you can use rowwise that enables you to work line by line:

library(dplyr)
library(lubridate)

nova_base <- base %>% 
 rowwise() %>%
 do(data.frame(ID = .$ID, DATA = seq(ymd(.$data_inicio), ymd(.$data_fim), by = '1 day')))

The function seq() takes an initial and a final value, thus calculating the intermediate values.

Exit:

  ID       DATA      
  A     2010-01-05
  A     2010-01-06
  A     2010-01-07
  B     2010-01-07
  B     2010-01-08
  B     2010-01-09
  C     2020-11-30
  C     2020-12-01
  C     2020-12-02
  C     2020-12-03
  • I need a line for each date. The suggestion you put gives me the difference of days between the variables data_start and data_end.

  • Talk Rubens, all good? I added a new answer. I hadn’t really looked at the exit you were expecting. Hug!!

3

Similar to reply from @lmonferrari, but using data.table:

library(data.table)

setDT(base)  # estabelece como data.table

> base[, .(DATA = seq(as.IDate(data_inicio), as.IDate(data_fim), by = "1 day")), ID]
    ID       DATA
 1:  A 2010-01-05
 2:  A 2010-01-06
 3:  A 2010-01-07
 4:  B 2010-01-07
 5:  B 2010-01-08
 6:  B 2010-01-09
 7:  C 2020-11-30
 8:  C 2020-12-01
 9:  C 2020-12-02
10:  C 2020-12-03

Alternatively, first convert the relevant columns:

base[, 2:3 := lapply(.SD, as.IDate), .SDcols = 2:3]

base[, .(DATA = seq(data_inicio, data_fim, by = "1 day")), ID]

The class Idatetime is part of the package data.table; its use is similar to the methods used by lubridate.

See help for data.table::fread to see how to load your data directly as data table. and with relevant columns as Idate.

3

A solution in R base may be as follows.

saida <- apply(base, 1, function(x) {
  x <- unname(x)
  cbind.data.frame(
    ID = x[1],
    DATA = seq(as.Date(x[2]), as.Date(x[3]), by = "1 day")
  )
})
saida <- do.call(rbind, saida)

saida
#   ID       DATA
#1   A 2010-01-05
#2   A 2010-01-06
#3   A 2010-01-07
#4   B 2010-01-07
#5   B 2010-01-08
#6   B 2010-01-09
#7   C 2020-11-30
#8   C 2020-12-01
#9   C 2020-12-02
#10  C 2020-12-03

Browser other questions tagged

You are not signed in. Login or sign up in order to post.