How to create a loop that turns columns into variables and returns Shapiro.test at the end?

Question

Asked 6 years, 11 months ago

Viewed 146 times

I have several .csv files with a large number of columns. I would like to streamline the work by creating a function that reads the columns and returns the result of the normality test (shapiro.test) for each of them.

    data <- read.csv2("C:/Users/z/Desktop/CSVFOREST_WB.csv")

    tnorm <- function(x){
      for (a in x) {
        a = x[[1,]]
        return(shapiro.test(a))
      }
    }
    tnorm(data)

The code, of course, returns an error. What can I do?

1 answer

by Marcus Nunes • **17,915** points · Answer 1 · 2018-09-03T10:15:38+00:00

R is not a particularly good language for loops such as for and while. Depending on the number of iterations and their complexity, execution can become very slow.

However, it has some functions that make life easier for anyone who wants to repeat the same calculation many times. Some of these functions belong to the *apply family, such as apply, sapply and lapply.
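As a rough illustration of how two members of this family differ, the sketch below applies mean to each column of a tiny data frame. The data frame df and its columns a and b are made up for this example only.

```r
# Hypothetical toy data frame, used only to illustrate the return types
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# lapply() always returns a list, with one element per column
as_list <- lapply(df, mean)

# sapply() tries to simplify the result, here to a named numeric vector
as_vector <- sapply(df, mean)

print(as_list)
print(as_vector)
```

Both calls treat the data frame as a list of columns, so no explicit loop over column names is needed.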

Take, for example, the data set below. It has 5 columns, each with 100 observations. Every column is drawn from a normal distribution with mean 0 and standard deviation 1:

n <- 100 # sample size
r <- 5   # number of samples

dados <- data.frame(matrix(rnorm(n*r, mean=0, sd=1), ncol=r))

To test the normality of each column of this data set, just run

apply(dados, 2, shapiro.test)

in which

dados: is the data set
2: indicates that the function will be applied to each column of dados. With 1 instead, the function would be applied to the rows of dados
shapiro.test: is the function applied to each column of dados (because of the 2 in the previous item)
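To make the role of the second argument concrete, here is a minimal sketch on a small matrix. The matrix m and the use of sum are chosen only for illustration; they are not part of the original example.

```r
# Small 2 x 3 matrix so the effect of the margin argument is easy to see.
# matrix() fills column by column, so the rows are (1, 3, 5) and (2, 4, 6).
m <- matrix(1:6, nrow = 2)

row_sums <- apply(m, 1, sum)  # margin 1: one result per row
col_sums <- apply(m, 2, sum)  # margin 2: one result per column

print(row_sums)  # 9 12
print(col_sums)  # 3 7 11
```

Swapping sum for shapiro.test and m for dados gives the call shown above.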

The result obtained is as follows:

$X1

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98757, p-value = 0.4773


$X2

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98678, p-value = 0.4228


$X3

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.95448, p-value = 0.001656


$X4

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98871, p-value = 0.5622


$X5

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98234, p-value = 0.2015

Note that the Shapiro-Wilk test was applied to each column, giving the value of the test statistic (W) and its associated p-value.
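With many columns, the printed list gets long. One possible way to condense it, sketched here under a few assumptions, is to keep only the numeric columns and collect W and the p-value in a small table. The names numeric_cols, tests and results are introduced for this sketch only, and the data are simulated again, so the numbers will differ from the output above. Note also that shapiro.test only accepts between 3 and 5000 non-missing values per column.

```r
# Simulated data as in the example above (values differ on every run)
dados <- data.frame(matrix(rnorm(100 * 5, mean = 0, sd = 1), ncol = 5))

# Keep only numeric columns; shapiro.test() needs a numeric vector
numeric_cols <- dados[sapply(dados, is.numeric)]

# lapply() walks over the columns of the data frame, one test per column
tests <- lapply(numeric_cols, shapiro.test)

# Pull the W statistic and the p-value out of each test result
results <- data.frame(
  column    = names(tests),
  W         = sapply(tests, function(res) unname(res$statistic)),
  p_value   = sapply(tests, function(res) res$p.value),
  row.names = NULL
)

print(results)
```

Using lapply on the data frame, rather than apply, also avoids the conversion to a matrix that apply performs internally, which is why the output above shows "data:  newX[, i]" instead of the column name.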