R is not a very good language for explicit loops such as for and while. Depending on the number of replications and their complexity, execution may become very slow.
However, it has some functions that make life easier for those who want to repeat the same calculation many times. Some of these functions belong to the *apply family, such as apply, sapply and lapply.
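As a minimal sketch of how these functions stand in for an explicit loop, the example below computes the mean of each element of a small list in three ways. The list values and its contents are made up for illustration and are not part of the data set used in this post.

```r
# Hypothetical list, used only for illustration
values <- list(a = 1:5, b = c(2, 4, 6), c = 10)

# Explicit loop: preallocate the result, then fill it in
means_loop <- numeric(length(values))
names(means_loop) <- names(values)
for (i in seq_along(values)) {
  means_loop[i] <- mean(values[[i]])
}

# lapply returns a list with one element per input element
means_list <- lapply(values, mean)

# sapply tries to simplify the result, here to a named numeric vector
means_vector <- sapply(values, mean)
```

All three hold the same means; lapply keeps them in a list, while sapply and the loop give a named vector.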
Take, for example, the data set below. It has 5 columns, each with 100 observations drawn from a normal distribution with mean 0 and standard deviation 1:
n <- 100 # sample size
r <- 5 # number of samples
dados <- data.frame(matrix(rnorm(n*r, mean=0, sd=1), ncol=r))
To test the normality of each column of this data set, just run
apply(dados, 2, shapiro.test)
where
dados: is the data set
2: indicates that the function will be applied to each column of dados. If it were 1, the function would be applied to the rows of dados
shapiro.test: is the function applied to each column (the 2 in the item above) of dados
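The difference between 1 and 2 in the second argument is easier to see on a tiny example. The sketch below uses a made-up 2 x 3 matrix m and the function sum, neither of which is part of the original data set.

```r
# Small hypothetical matrix with 2 rows and 3 columns:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# Second argument 1: sum is applied to each row
row_sums <- apply(m, 1, sum)   # 9 12

# Second argument 2: sum is applied to each column
col_sums <- apply(m, 2, sum)   # 3 7 11
```

The result has one value per row in the first case and one value per column in the second.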
The result obtained is as follows:
$X1
Shapiro-Wilk normality test
data: newX[, i]
W = 0.98757, p-value = 0.4773
$X2
Shapiro-Wilk normality test
data: newX[, i]
W = 0.98678, p-value = 0.4228
$X3
Shapiro-Wilk normality test
data: newX[, i]
W = 0.95448, p-value = 0.001656
$X4
Shapiro-Wilk normality test
data: newX[, i]
W = 0.98871, p-value = 0.5622
$X5
Shapiro-Wilk normality test
data: newX[, i]
W = 0.98234, p-value = 0.2015
Note that the Shapiro-Wilk test was applied to each column, and for each one we obtained the value of the test statistic W and its associated p-value.
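If only the p-values are of interest, one possible approach is to extract them directly with sapply. This is a sketch, not the only way to do it: the seed 123 is arbitrary and is set only so that the example is reproducible, so the numbers it produces will differ from the output shown above.

```r
set.seed(123)  # arbitrary seed, assumed here only for reproducibility
dados <- data.frame(matrix(rnorm(100 * 5, mean = 0, sd = 1), ncol = 5))

# shapiro.test returns a list-like object; p.value is one of its components.
# sapply runs over the columns of the data frame and returns a named vector.
p_values <- sapply(dados, function(x) shapiro.test(x)$p.value)

# Names of the columns for which normality is rejected at the 5% level
rejected <- names(p_values)[p_values < 0.05]
```

In the output shown above, column X3 has a p-value of 0.001656, so normality would be rejected for it at the 5% level even though the data were simulated from a normal distribution. A false rejection like this can happen by chance, and it becomes more likely as more columns are tested.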