lapply does not return the desired result for some functions


My list:

structure(list(col1 = structure(list(a = 1:5, b = 1:5, c = 1:5), .Names = c("a", 
"b", "c"), row.names = c(NA, -5L), class = "data.frame"), col2 = structure(list(
    a = 6:10, c = 6:10), .Names = c("a", "c"), row.names = c(NA, 
-5L), class = "data.frame"), col3 = structure(list(a = 11:15, 
    c = 11:15), .Names = c("a", "c"), row.names = c(NA, -5L), class = "data.frame"), 
    col4 = structure(list(a = 16:20, b = 16:20), .Names = c("a", 
    "b"), row.names = c(NA, -5L), class = "data.frame"), col5 = structure(list(
        a = 21:25, c = 21:25), .Names = c("a", "c"), row.names = c(NA, 
    -5L), class = "data.frame")), .Names = c("col1", "col2", 
"col3", "col4", "col5"))

I tried:

res<-lapply(list,function(x)colSums(subset(x,select=c('a'))))

and

res<-lapply(list,function(x)colMeans(subset(x,select=c(1,2))))

and both gave the expected result.

But when I run:

res<-lapply(list,function(x)shapiro.test(subset(x,select=c(1,2))))

it fails (Error: is.numeric(x) is not TRUE).

What should I do?

  • 1

    This error arises because the Shapiro test only applies to numeric vectors, not data frames. From the help page: "x: a numeric vector of data values." A normality test only makes sense for a single vector; each column of the data frame may or may not follow a normal distribution. As you say, colSums and colMeans work fine because they accept data frames.
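A minimal sketch of the point made in the comment above, using a hypothetical toy data frame `df` (not the list from the question): passing a whole data frame to shapiro.test raises an error, while passing one numeric column works.

```r
# Hypothetical toy data frame, similar in shape to the elements of the list.
df <- data.frame(a = 1:5, b = 6:10)

# Passing the whole data frame fails the numeric check inside shapiro.test().
err <- tryCatch(shapiro.test(df), error = function(e) e)
inherits(err, "error")

# Passing a single column (a numeric vector) works and returns an "htest" object.
res <- shapiro.test(df$a)
class(res)

# colMeans() accepts a data frame directly, which is why it raised no error.
colMeans(df)
```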

1 answer

4


As Rui's comment rightly points out, shapiro.test only accepts numeric vectors. But nothing prevents us from writing a wrapper that applies it to each column of a data frame:

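# Apply shapiro.test to each column of a data frame.
# Note: apply() coerces df to a matrix first, so all columns must be numeric.
shapiro.test.df <- function(df){
  apply(df, 2, shapiro.test)
}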

The function shapiro.test.df simply applies shapiro.test to each column of a data frame. Now use lapply to apply it to every element of the list (called dados here), provided that these elements are data frames:

lapply(dados, shapiro.test.df)

$col1
$col1$a

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672


$col1$b

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672


$col1$c

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672



$col2
$col2$a

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672


$col2$c

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672



$col3
$col3$a

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672


$col3$c

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672



$col4
$col4$a

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672


$col4$b

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672



$col5
$col5$a

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672


$col5$c

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.98676, p-value = 0.9672
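
If only the p-values are needed, the nested list of test results can be condensed. The sketch below is one possible variant, not part of the original answer: `shapiro_by_column` and `p_values` are hypothetical names, and `dados` is rebuilt here with just two of the data frames so that the example is self-contained. It iterates over columns with lapply rather than apply, which avoids the matrix coercion, and it skips non-numeric columns.

```r
# Hypothetical stand-in for the list in the question (two elements only).
dados <- list(
  col1 = data.frame(a = 1:5, b = 1:5, c = 1:5),
  col2 = data.frame(a = 6:10, c = 6:10)
)

# Variant of the wrapper: lapply() iterates over the columns without coercing
# the data frame to a matrix, and Filter() drops any non-numeric columns.
shapiro_by_column <- function(df) {
  lapply(Filter(is.numeric, df), shapiro.test)
}

tests <- lapply(dados, shapiro_by_column)

# Reduce each test to its p-value: one named numeric vector per data frame.
p_values <- lapply(tests, function(tl) {
  vapply(tl, function(t) t$p.value, numeric(1))
})

p_values$col1
```

Each element of `p_values` is a numeric vector named after the columns of the corresponding data frame, which is easier to scan or to compare against a significance level than the full printed output.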
