0
df
is the data frame with the data that I will work with
df <- data.frame(matrix(NA,nrow = 30, ncol = 6))
colnames(df) <- c("ano","mes","x1","x2","x3","x4")
df$ano <- c(2002,2002,2002,2003,2003,2003,2004,2004,2004,2004,2002,2002,2002,2003,2003,2003,2004,2004,2004,2004,2002,2002,2002,2003,2003,2003,2004,2004,2004,2004)
df$mes <- c(5,9,7,3,8,2,4,1,6,10,2,11,12,10,12,2,8,9,10,7,6,4,5,7,3,1,9,5,2,6)
df$x1 <- c(2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60)
df$x2 <- c(3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57,60,63,66,69,72,75,78,81,84,87,90)
df$x3 <- c(4,8,12,16,20,24,28,32,36,40,44,48,52,56,60,64,68,72,76,80,84,88,92,96,100,104,108,112,116,120)
df$x4 <- c(5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150)
View(df)
resumo
is the data frame where I want my analysis to be recorded in order to export to . csv at the end
resumo <- data.frame(matrix(NA,nrow = 12, ncol = 7))
colnames(resumo) <- c("mes","media_diaria_X1","media_mensal_X1","media_diaria_X4","media_mensal_X4","media_diaria_x3_divido_x4","media_mensal_x3_divido_x4")
resumo$mes <- c(1,2,3,4,5,6,7,8,9,10,11,12)
View(resumo)
library(dplyr)
When I use the dplyr package it returns me values with the columns named equal to the columns of the data frame df
as the selection is performed by the columns of df
.
The big question is that I would like the results obtained below to be recorded in the data frame resumo
with the columns named according to what I defined above .
Below are the calculations I managed to do and I intend to do with the help of someone:
X1 and X4 Daily Averages Calculation
media_diaria <- df %>%
group_by(mes) %>%
summarise_at(vars(x1,x4), mean, na.rm = TRUE)
dput(media_diaria)
structure(list(mes = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
x1 = c(34, 31, 29, 29, 34.6666666666667, 40, 31.3333333333333,
22, 31.3333333333333, 28.6666666666667, 24, 28), x4 = c(85,
77.5, 72.5, 72.5, 86.6666666666667, 100, 78.3333333333333,
55, 78.3333333333333, 71.6666666666667, 60, 70)), row.names = c(NA,
-12L), class = c("tbl_df", "tbl", "data.frame"))
Calculation of monthly averages (referring to monthly sum) of X1 and X4
soma_mensal <- df %>%
group_by(ano,mes) %>%
summarise_at(vars(x1,x4), sum, na.rm = TRUE)
media_mensal <- soma_mensal %>%
group_by(mes) %>%
summarise_at(vars(x1,x4), mean, na.rm = TRUE)
dput(soma_mensal)
structure(list(ano = c(2002, 2002, 2002, 2002, 2002, 2002, 2002,
2002, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2004, 2004, 2004,
2004, 2004, 2004, 2004, 2004, 2004), mes = c(2, 4, 5, 6, 7, 9,
11, 12, 1, 2, 3, 7, 8, 10, 12, 1, 2, 4, 5, 6, 7, 8, 9, 10), x1 = c(22,
44, 48, 42, 6, 4, 24, 26, 52, 44, 58, 48, 10, 28, 30, 16, 58,
14, 56, 78, 40, 34, 90, 58), x4 = c(55, 110, 120, 105, 15, 10,
60, 65, 130, 110, 145, 120, 25, 70, 75, 40, 145, 35, 140, 195,
100, 85, 225, 145)), row.names = c(NA, -24L), groups = structure(list(
ano = c(2002, 2003, 2004), .rows = structure(list(1:8, 9:15,
16:24), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
dput(media_mensal)
structure(list(mes = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
x1 = c(34, 41.3333333333333, 58, 29, 52, 60, 31.3333333333333,
22, 47, 43, 24, 28), x4 = c(85, 103.333333333333, 145, 72.5,
130, 150, 78.3333333333333, 55, 117.5, 107.5, 60, 70)), row.names = c(NA,
-12L), class = c("tbl_df", "tbl", "data.frame"))
From that point on the calculations didn’t work out and I can’t solve.
Calculations of the daily average with the result obtained from the divorce of X3 by X4. Did not work!
media_diaria_divisao <- df %>%
group_by(mes) %>%
summarise_at((df$x3/df$x4), mean, na.rm = TRUE)
Error: `.vars` must be a character/numeric vector or a `vars()` object, not a double vector.
Run `rlang::last_error()` to see where the error occurred.
Calculation of monthly averages (referring to monthly sum) of X1 and X4. Also failed!
soma_mensal <- df %>%
group_by(ano,mes) %>%
summarise_at((df$x3/df$x4), sum, na.rm = TRUE)
media_mensal <- soma_mensal %>%
group_by(mes) %>%
summarise_at((df$x3/df$x4), mean, na.rm = TRUE)
Error: `.vars` must be a character/numeric vector or a `vars()` object, not a double vector.
Run `rlang::last_error()` to see where the error occurred.
Summing up my two doubts:
How I can perform the above 2 analyses in relation to the X3/X4 division with the
dplyr
? If using thedplyr
not be a good option, what would be a possible solution using functions? I’m trying to avoid working with loops.How do I feed the dataframe
resumo
with the results of the above analyses?
I didn’t understand the difference between
media_diaria
andmedia_mensal
. The results are equal, at least in the example provided.– Marcus Nunes
Marcus Nunes considering the X1 column for example: for the calculation of media_diary, a grouping was performed with all the data referring to the month of January, for example, which is represented by the number 1 in the dataframe, and then I averaged. For the monthly media_first, a sum of the values of each month in relation to its year was performed and then the average of this sum was performed. January/2002 had a summation, January/2003 had another summation and so on. Then I performed the average of this sum and obtained the monthly medium.
– Bruna
Images are hard to see, post values and format with 4 margin spaces, please. As for the code, you will not be looking for
summarise(med_dia_divisao = mean(x3/x4, na.rm = TRUE))
in the first that did not work? In the others the solution should be similar.– Rui Barradas
Rui Barradas removed the images and placed
dput
of each of the data frames. As for the suggestion you presented me, it did not work, I had tried this way before, but it returns me a single value for all months (value=0.8), and it should be a specific value for each month. Marcus Nunes, maybe with thedput
that I added, the visualization is better. As there are few values, in some cases media_diaria and media_monthly are coinciding, but if you observe the month 10 (October) already has difference between them.– Bruna