Standardizing data values in R using the column mean and standard deviation

What I need to do is simple: take the value of a cell, subtract the mean of the column's values, and then divide by the standard deviation of the column's values.

Example:

  • Cell value = 2
  • Mean of the column = 1
  • Standard deviation of the column = 0.5
  • Calculation = (2 - 1) / 0.5
  • Result = 2
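
In formula form, the same calculation can be written as below. The symbols are introduced here only for illustration: x is the cell value, \bar{x} the column mean, and s the column standard deviation.

```latex
z = \frac{x - \bar{x}}{s} = \frac{2 - 1}{0.5} = 2
```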

I did it by looping over the data frame cell by cell, but it only works for the first row:

teste <- data.frame(ANO = c(2011, 2012),
                    C1 = c(1, 2),
                    C2 = c(3, 4))

> teste
   ANO C1 C2
1 2011  1  3
2 2012  2  4

for (linha in 1:nrow(teste)) {
  for (coluna in 2:ncol(teste)) {
    teste[linha, coluna] = (teste[linha, coluna] - mean(teste[ , coluna])) / sd(teste[ , coluna])
  }
}

> teste
   ANO         C1         C2
1 2011 -0.7071068 -0.7071068
2 2012  0.7071068  0.7071068

I believe there are better ways to solve this in R that also give the correct values.

1 answer

Base R

Just apply the base function scale to each of the columns.

res <- teste
res[-1] <- lapply(res[-1], scale)
res
#   ANO         C1         C2
#1 2011 -0.7071068 -0.7071068
#2 2012  0.7071068  0.7071068
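
One detail worth knowing: scale returns a one-column matrix (carrying "scaled:center" and "scaled:scale" attributes), so the columns of res above are matrices rather than plain numeric vectors. If plain vectors are preferred, a minimal sketch, assuming the teste data frame from the question (res_vec is just an illustrative name):

```r
teste <- data.frame(ANO = c(2011, 2012),
                    C1 = c(1, 2),
                    C2 = c(3, 4))

res_vec <- teste
# scale() returns a matrix; as.numeric() drops the dim and the extra attributes
res_vec[-1] <- lapply(res_vec[-1], function(x) as.numeric(scale(x)))

# Inspect the column types: C1 and C2 are now plain numeric vectors
str(res_vec)
```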

dplyr package

library(dplyr)
teste %>% mutate(across(C1:C2, scale))
#   ANO         C1         C2
#1 2011 -0.7071068 -0.7071068
#2 2012  0.7071068  0.7071068

In response to a comment: in both base R and dplyr, an anonymous function can be used instead of the scale function.

res[-1] <- lapply(res[-1], function(x) (x - mean(x))/sd(x))

teste %>% mutate(across(C1:C2, function(x) (x - mean(x))/sd(x)))
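
As for why the loop in the question goes wrong after the first row: each assignment overwrites a cell, so the mean and sd for later rows are computed from a column that already contains standardized values. With the two-row example the output happens to match anyway, which hides the problem. Below is a sketch of a loop-based fix that computes the statistics once per column before anything is modified. The three-row data frame and the names dados, medias, desvios, padronizado and esperado are illustrative assumptions, not from the question.

```r
# Illustrative data with three rows, so the in-place problem would show up
dados <- data.frame(ANO = c(2011, 2012, 2013),
                    C1 = c(1, 2, 4),
                    C2 = c(3, 4, 8))

# Compute each column's statistics once, from the untouched data
medias  <- sapply(dados[-1], mean)
desvios <- sapply(dados[-1], sd)

# Write the results into a copy, reading only from the original
padronizado <- dados
for (coluna in names(dados)[-1]) {
  padronizado[[coluna]] <- (dados[[coluna]] - medias[[coluna]]) / desvios[[coluna]]
}

# Compare with the vectorized version from the answer
esperado <- dados
esperado[-1] <- lapply(esperado[-1], function(x) (x - mean(x)) / sd(x))
all.equal(padronizado, esperado)
```

The vectorized versions above avoid the issue entirely, because mean(x) and sd(x) are evaluated on the whole original column before any value is replaced.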
  • If I wanted to use the standardization formula I gave in the question, what would it look like?

  • @Diegowenceslau You can define a function f <- function(x) (x - mean(x)) / sd(x) and call it the same way scale is called.

  • @Diegowenceslau Or with an anonymous function, see the edit to the answer.
