Error with lapply function

I am trying to run the following call:

result<-lapply(mylist,function(x)cbind(x,var=tapply(x[,c(3)],x[,c(1)],sum)))

But it raises this error:

Error in data.frame(..., check.names = FALSE) : 
arguments imply differing number of rows: 340623, 63073

I need to add the sum as a new column in each data frame in my list. What is wrong with the call?

Unfortunately, I cannot share the data.

1 answer

Without the data it is hard to reproduce the problem you are encountering. I will use the structure you have already posted in other questions:

mylist
[[1]]
     number group      sexo
1  26.12186     a Masculino
2  40.39104     a Masculino
3  29.29426     a Masculino
4  45.11651     b  Feminino
5  26.72512     b Masculino
6  45.95550     b Masculino
7  47.56538     c  Feminino
8  43.14062     c  Feminino
9  47.42608     c Masculino
10 23.57519     c  Feminino

[[2]]
     number group      sexo
1  47.64770     a Masculino
2  22.61412     a  Feminino
3  48.37883     a Masculino
4  48.44754     b Masculino
5  41.67047     b  Feminino
6  23.74823     b Masculino
7  28.82786     c Masculino
8  30.12309     c  Feminino
9  27.12305     c Masculino
10 49.58259     c  Feminino
11 40.21284     d Masculino
12 40.57279     d  Feminino
13 48.33335     d Masculino
14 22.92160     d Masculino
15 25.07216     e Masculino

I went through what you want step by step. Running tapply gives you a one-dimensional named array with one element per group, here the sum of the values for each sex:

tapply(mylist[[1]][,1], mylist[[1]][,3], sum)
 Feminino Masculino 
 159.3977  215.9139

That is why the error appears: cbind tries to bind a data.frame to the result of tapply, which has a different number of rows (one per group instead of one per observation).
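The mismatch can be reproduced on a small scale. This is only a minimal sketch: the data frame `df` and its values are made up for illustration and are not the asker's data.

```r
# Hypothetical toy data: 5 rows, 2 groups (values are made up)
df <- data.frame(number = c(1, 2, 3, 4, 5),
                 sexo   = c("F", "M", "F", "M", "F"))

# tapply returns one sum per group, not one per row
sums <- tapply(df$number, df$sexo, sum)

n_rows   <- nrow(df)      # number of rows in the data frame
n_groups <- length(sums)  # number of values returned by tapply

# cbind() on a data frame ends up in data.frame(), which stops
# because 5 rows cannot be lined up with 2 values
msg <- tryCatch(
  { cbind(df, var = sums); "no error" },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Comparing `n_rows` with `n_groups` before binding is a quick way to confirm this is what is happening with the real data.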

To get around this problem, and assuming that what you want is the sum for each sex as a new variable, you can base your solution on the following code:

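# Work on the first data frame of the list
teste <- mylist[[1]]
# Sum of number (column 1) for each sexo (column 3)
teste1 <- tapply(teste[,1], teste[,3], sum)
# Reshape the sums into a data frame with columns sexo and value
teste2 <- tidyr::gather(data.frame(teste1), key = "sexo")
teste2$sexo <- names(teste1)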

dplyr::left_join(teste, teste2)
Joining, by = "sexo"
     number group      sexo    value
1  26.12186     a Masculino 215.9139
2  40.39104     a Masculino 215.9139
3  29.29426     a Masculino 215.9139
4  45.11651     b  Feminino 159.3977
5  26.72512     b Masculino 215.9139
6  45.95550     b Masculino 215.9139
7  47.56538     c  Feminino 159.3977
8  43.14062     c  Feminino 159.3977
9  47.42608     c Masculino 215.9139
10 23.57519     c  Feminino 159.3977

I only ran this on the first data.frame of the list, just to try to understand the problem.
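To do the same for every data frame in the list, one possible sketch uses base R's ave() inside lapply. Note that this swaps the tapply-plus-join approach for ave(), which returns one value per row. The `mylist` below is a made-up stand-in with the same column names as the example; change the grouping column if you want sums by something other than sexo.

```r
# Hypothetical stand-in for mylist: two small data frames with the
# same columns as in the example (values are made up)
mylist <- list(
  data.frame(number = c(1, 2, 3, 4),
             group  = c("a", "a", "b", "b"),
             sexo   = c("Masculino", "Feminino", "Masculino", "Feminino")),
  data.frame(number = c(10, 20, 30),
             group  = c("a", "b", "b"),
             sexo   = c("Feminino", "Feminino", "Masculino"))
)

# ave() applies FUN within each group and repeats the result for
# every row of that group, so its length always matches nrow(x)
result <- lapply(mylist, function(x) {
  x$value <- ave(x$number, x$sexo, FUN = sum)
  x
})

result[[1]]
```

Because the new column has the same length as the data frame, no join or cbind is needed and the row-count error cannot occur.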
