1
I need to calculate the percentage of days that are in a certain range of values in the analyzed column (values are in percentage) in relation to the total quantity of days for that month.
Initially I survey how many days each month has, I do this mainly due to the month of February that may vary:
qtd_dias_mes <- dados %>%
group_by(mes) %>%
select(analise) %>%
dplyr::summarise(n()) %>%
dplyr::rename("qtd_dias_mes" = "n()")
View(qtd_dias_mes)
Now I survey how many days are between the interval that is greater than 75 and less than 100:
analise_75_e_100 <- dados %>%
filter(dados$analise > 75 & dados$analise < 100) %>%
group_by(mes) %>%
dplyr::summarise(n())
the values delivered by dplyr::summarise(n()) do not contemplate month 6 (with the complete dataframe), and when making the calculation I need (below), end up appearing month 6 (which should not) and wrong values from it, and should appear 0 for month 6:
porc_dias_entre_75_e_100 = (analise_75_e_100$`n()` / qtd_dias_mes$qtd_dias_mes) * 100
In order to simplify my question here, I presented above the calculation with this interval, however I will calculate for different percentage intervals, and may appear less not contemplated, and should then be filled with zero. What alternatives for these calculations to be correct?
My data have 10957 lines (1/1/1990 to 12/31/2019), maybe making with the sample below change the month 6 that I exemplified above, however the reasoning is the same.
> dput(head(dados, 200))
structure(list(data = c("1/1/1990", "1/2/1990", "1/3/1990", "1/4/1990",
"1/5/1990", "1/6/1990", "1/7/1990", "1/8/1990", "1/9/1990", "1/10/1990",
"1/11/1990", "1/12/1990", "1/13/1990", "1/14/1990", "1/15/1990",
"1/16/1990", "1/17/1990", "1/18/1990", "1/19/1990", "1/20/1990",
"1/21/1990", "1/22/1990", "1/23/1990", "1/24/1990", "1/25/1990",
"1/26/1990", "1/27/1990", "1/28/1990", "1/29/1990", "1/30/1990",
"1/31/1990", "2/1/1990", "2/2/1990", "2/3/1990", "2/4/1990",
"2/5/1990", "2/6/1990", "2/7/1990", "2/8/1990", "2/9/1990", "2/10/1990",
"2/11/1990", "2/12/1990", "2/13/1990", "2/14/1990", "2/15/1990",
"2/16/1990", "2/17/1990", "2/18/1990", "2/19/1990", "2/20/1990",
"2/21/1990", "2/22/1990", "2/23/1990", "2/24/1990", "2/25/1990",
"2/26/1990", "2/27/1990", "2/28/1990", "3/1/1990", "3/2/1990",
"3/3/1990", "3/4/1990", "3/5/1990", "3/6/1990", "3/7/1990", "3/8/1990",
"3/9/1990", "3/10/1990", "3/11/1990", "3/12/1990", "3/13/1990",
"3/14/1990", "3/15/1990", "3/16/1990", "3/17/1990", "3/18/1990",
"3/19/1990", "3/20/1990", "3/21/1990", "3/22/1990", "3/23/1990",
"3/24/1990", "3/25/1990", "3/26/1990", "3/27/1990", "3/28/1990",
"3/29/1990", "3/30/1990", "3/31/1990", "4/1/1990", "4/2/1990",
"4/3/1990", "4/4/1990", "4/5/1990", "4/6/1990", "4/7/1990", "4/8/1990",
"4/9/1990", "4/10/1990", "4/11/1990", "4/12/1990", "4/13/1990",
"4/14/1990", "4/15/1990", "4/16/1990", "4/17/1990", "4/18/1990",
"4/19/1990", "4/20/1990", "4/21/1990", "4/22/1990", "4/23/1990",
"4/24/1990", "4/25/1990", "4/26/1990", "4/27/1990", "4/28/1990",
"4/29/1990", "4/30/1990", "5/1/1990", "5/2/1990", "5/3/1990",
"5/4/1990", "5/5/1990", "5/6/1990", "5/7/1990", "5/8/1990", "5/9/1990",
"5/10/1990", "5/11/1990", "5/12/1990", "5/13/1990", "5/14/1990",
"5/15/1990", "5/16/1990", "5/17/1990", "5/18/1990", "5/19/1990",
"5/20/1990", "5/21/1990", "5/22/1990", "5/23/1990", "5/24/1990",
"5/25/1990", "5/26/1990", "5/27/1990", "5/28/1990", "5/29/1990",
"5/30/1990", "5/31/1990", "6/1/1990", "6/2/1990", "6/3/1990",
"6/4/1990", "6/5/1990", "6/6/1990", "6/7/1990", "6/8/1990", "6/9/1990",
"6/10/1990", "6/11/1990", "6/12/1990", "6/13/1990", "6/14/1990",
"6/15/1990", "6/16/1990", "6/17/1990", "6/18/1990", "6/19/1990",
"6/20/1990", "6/21/1990", "6/22/1990", "6/23/1990", "6/24/1990",
"6/25/1990", "6/26/1990", "6/27/1990", "6/28/1990", "6/29/1990",
"6/30/1990", "7/1/1990", "7/2/1990", "7/3/1990", "7/4/1990",
"7/5/1990", "7/6/1990", "7/7/1990", "7/8/1990", "7/9/1990", "7/10/1990",
"7/11/1990", "7/12/1990", "7/13/1990", "7/14/1990", "7/15/1990",
"7/16/1990", "7/17/1990", "7/18/1990", "7/19/1990"), dia_da_semana = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 1L, 2L, 3L, 4L), mes = c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L), ano = c(1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L,
1990L, 1990L, 1990L, 1990L), analise = c(0, 0, 0, 0, 100, 100,
100, 100, 100, 100, 100, 78, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 100, 56.01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
100, 100, 100, 100, 100, 100, 36.23, 0, 0, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 59.74, 0, 0, 0, 0, 100, 78, 0,
0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 66.46, 100, 56.01, 0, 0,
0, 0, 100, 100, 72.23, 0, 0, 0, 0, 0, 100, 100, 100, 100, 78,
100, 100, 100, 44.46, 100, 100, 100, 100, 100, 100, 100, 59.74,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 100, 56.01, 0, 0, 0, 0, 100,
100, 100, 54.97, 0, 0, 0, 0, 100, 56.01, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 100, 100, 75.35, 0, 0, 0)), row.names = c(NA,
200L), class = "data.frame")
When you say month, means year and month?
– Rui Barradas