7
I have a data set similar to the one below. It has a column with dates and another with occurrences of groups on these dates.
data grupo
1 2019-01-01 a
2 2019-01-01 a
3 2019-01-01 a
4 2019-01-01 a
5 2019-01-02 b
6 2019-01-02 b
7 2019-01-02 a
8 2019-01-03 a
9 2019-01-03 a
10 2019-01-03 a
11 2019-01-04 a
12 2019-01-04 b
13 2019-01-04 b
14 2019-01-05 a
15 2019-01-05 a
16 2019-01-05 a
17 2019-01-05 b
18 2019-01-06 b
19 2019-01-06 a
20 2019-01-06 a
21 2019-01-07 b
22 2019-01-07 b
23 2019-01-07 a
24 2019-01-08 b
25 2019-01-08 a
26 2019-01-08 a
27 2019-01-09 a
28 2019-01-09 a
29 2019-01-09 b
30 2019-01-10 a
I want to calculate the cumulative sum of the occurrences of the groups in relation to the dates present. For example, in 2019-01-01, the group a
occurred 4 times. In 2019-01-02, a
occurred once, which gives a cumulative occurrence equal to 5. And so on, for each group and each date.
Dates are in sequence, but some are missing. Groups are not present on all dates. That being said, the answer I seek for the above set is as follows:
data grupo acumulada
2019-01-01 a 4
2019-01-01 b 0
2019-01-02 a 5
2019-01-02 b 2
2019-01-03 a 8
2019-01-03 b 2
2019-01-04 a 9
2019-01-04 b 4
2019-01-05 a 12
2019-01-05 b 5
2019-01-06 a 14
2019-01-06 b 6
2019-01-07 a 15
2019-01-07 b 8
2019-01-08 a 17
2019-01-08 b 9
2019-01-09 a 19
2019-01-09 b 10
2019-01-10 a 20
2019-01-10 b 10
Below are the sample data to facilitate the life of those who try to solve my problem.
structure(list(data = structure(c(17897, 17897, 17897, 17897,
17898, 17898, 17898, 17899, 17899, 17899, 17900, 17900, 17900,
17901, 17901, 17901, 17901, 17902, 17902, 17902, 17903, 17903,
17903, 17904, 17904, 17904, 17905, 17905, 17905, 17906), class = "Date"),
grupo = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L,
1L, 1L, 1L, 2L, 1L), .Label = c("a", "b"), class = "factor")),
class = "data.frame", row.names = c(NA, -30L))
I didn’t know the function
complete
. Thank you.– Marcus Nunes