How to calculate the variance of the values within a range of groups in R?

For example, suppose I have the following dataset:

MES  RESP
1   4.67
1   5.11
2   5.22
2   4.99
3   4.60
3   5.39
4   4.98
4   5.29
5   5.82
5   5.01
6   5.90
6   4.22
7   4.40
7   4.69

How do I calculate the variance, mean, and standard deviation for months 2 to 4 and 6 to 7 together? That is, the variance, mean, and standard deviation of the following values, without having to delete anything from the dataset:

MES  RESP
2   5.22
2   4.99
3   4.60
3   5.39
4   4.98
4   5.29
6   5.90
6   4.22
7   4.40
7   4.69

1 answer

What you ask for can be done with the *apply family of functions, once a function that calculates the statistics of interest has been defined.

estatisticas <- function(x, na.rm = TRUE){
  m <- mean(x, na.rm = na.rm)
  v <- var(x, na.rm = na.rm)
  s <- sd(x, na.rm = na.rm)
  c(Media = m, Var = v, DesvPadrao = s)
}

Since only the rows where MES is between 2 and 4, or is 6 or 7, are needed, a subset containing just those rows is created. The original data frame dados is left untouched.

dados2 <- subset(dados, MES %in% c(2:4, 6:7))
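The call above assumes the question's table is already loaded as a data frame named dados. A minimal sketch for reconstructing it, assuming both columns are numeric and named MES and RESP exactly as in the question:

```r
# Rebuild the question's table (assumed layout: two rows per month, months 1 to 7)
dados <- data.frame(
  MES  = rep(1:7, each = 2),
  RESP = c(4.67, 5.11, 5.22, 4.99, 4.60, 5.39, 4.98, 5.29,
           5.82, 5.01, 5.90, 4.22, 4.40, 4.69)
)

# subset() returns a new data frame; dados itself is not modified
dados2 <- subset(dados, MES %in% c(2:4, 6:7))

nrow(dados)   # still 14 rows
nrow(dados2)  # 10 rows: months 2, 3, 4, 6 and 7
```

Because subset() works on a copy, nothing has to be deleted from the original data.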

First, I will calculate the statistics for each value of MES.

res <- tapply(dados2$RESP, dados2$MES, FUN = estatisticas)
res <- do.call(rbind, res)
res
#  Media     Var DesvPadrao
#2 5.105 0.02645  0.1626346
#3 4.995 0.31205  0.5586144
#4 5.135 0.04805  0.2192031
#6 5.060 1.41120  1.1879394
#7 4.545 0.04205  0.2050610

Now, calculate the same statistics by groups of MES: one group for months 2 to 4 and another for months 6 and 7.

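# FALSE marks months 2 to 4, TRUE marks months 6 and 7
grupo <- dados2$MES %in% 6:7
res2 <- tapply(dados2$RESP, grupo, FUN = estatisticas)
res2 <- do.call(rbind, res2)
# tapply orders the groups FALSE, TRUE, so the labels follow that order
row.names(res2) <- c(paste(2:4, collapse = "."), paste(6:7, collapse = "."))
res2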
#         Media        Var DesvPadrao
#2.3.4 5.078333 0.08165667  0.2857563
#6.7   4.802500 0.57282500  0.7568520
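If the question is read as asking for a single set of statistics over all ten selected values together, the same idea works without any grouping. The following is only a sketch under that reading; it rebuilds dados from the question's table, and the names sel, x and geral are introduced here purely for illustration.

```r
# Rebuild the question's table so the snippet runs on its own
dados <- data.frame(
  MES  = rep(1:7, each = 2),
  RESP = c(4.67, 5.11, 5.22, 4.99, 4.60, 5.39, 4.98, 5.29,
           5.82, 5.01, 5.90, 4.22, 4.40, 4.69)
)

# Logical index of the rows for months 2 to 4 and 6 to 7; dados is not modified
sel <- dados$MES %in% c(2:4, 6:7)
x <- dados$RESP[sel]

# One mean, variance and standard deviation for the ten values together
geral <- c(Media = mean(x), Var = var(x), DesvPadrao = sd(x))
geral
# Media is 49.68 / 10 = 4.968
```

Note that var() and sd() use the sample (n - 1) denominator, as they do in the per-month and per-group results above.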
