I have a data frame whose columns hold dates (%Y/%m/%d), times, and hourly averages over 4 months (01/01/2020 - 01/04/2020). How can I calculate the daily mean of these hourly values, using the pipe operator (%>%) or some other faster approach? My code is below:
library(tidyverse)
library(lubridate)
head(dados)
Data Hora Nome.Parâmetro Unidade.Medida Média.Horária
1 2020-01-03 01:00 MP10 (Partículas Inaláveis) µg/m3 12
2 2020-01-03 02:00 MP10 (Partículas Inaláveis) µg/m3 13
3 2020-01-03 03:00 MP10 (Partículas Inaláveis) µg/m3 4
4 2020-01-03 04:00 MP10 (Partículas Inaláveis) µg/m3 7
5 2020-01-03 05:00 MP10 (Partículas Inaláveis) µg/m3 16
6 2020-01-03 06:00 MP10 (Partículas Inaláveis) µg/m3 11
I executed the following command:
head(dados %>%
group_by(Data) %>%
summarise(med_dia = mean(dados$Média.Horária))
)
Data med_dia
<date> <dbl>
1 2020-01-03 22.8
2 2020-01-04 22.8
3 2020-01-05 22.8
4 2020-01-06 22.8
5 2020-01-07 22.8
6 2020-01-14 22.8
I expected the code above to give the mean of the hourly values for each day. Instead, the command appears to average the whole column indiscriminately and repeats the same value on every row.
Instead of mean(dados$Média.Horária), try dropping the data frame name: mean(Média.Horária). – Rui Barradas
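The suggestion in the comment above can be sketched as follows. Inside summarise(), dados$Média.Horária always refers to the full column and ignores the grouping, which is why every row showed the same value; the bare column name is evaluated per group. The sample data here is a small hypothetical stand-in for dados: the first four values come from the question, the rows for 2020-01-04 are invented for illustration, and med_diaria is just an illustrative name.

```r
library(dplyr)

# Hypothetical stand-in for `dados` (the 2020-01-04 rows are invented)
dados <- data.frame(
  Data = as.Date(c("2020-01-03", "2020-01-03", "2020-01-03", "2020-01-03",
                   "2020-01-04", "2020-01-04")),
  Hora = c("01:00", "02:00", "03:00", "04:00", "01:00", "02:00"),
  Média.Horária = c(12, 13, 4, 7, 10, 20)
)

# Refer to the column by its bare name so mean() is computed within each day
med_diaria <- dados %>%
  group_by(Data) %>%
  summarise(med_dia = mean(Média.Horária))

print(med_diaria)
```

If some hours are missing (NA) in the real data, mean(Média.Horária, na.rm = TRUE) may be what you want, since otherwise any day with a missing hour returns NA.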
To make the data easier to copy into an R session, could you please edit the question with the output of dput(dados) or, if the data set is too large, dput(head(dados, 20))? – Rui Barradas