Replacing for loops with the lapply function in R

4

This is my list:

lista1 <- list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10), c(11, 12, 13, 14, 15),
               c(16, 17, 18, 19, 20), c(21, 22, 23, 24, 25))

I build a new list as follows:

novalista <- vector(mode = "list", length=5)

for (i in 1:5) {
  for(k in 1:5) {     
    novalista[[i]][k] <- lista1[[i]][k]*8
  }
}

novalista

I want to write simpler, cleaner code using the lapply function. That is, I want to replace these two for loops with lapply.

The difficulty is using lapply with two different indices (i and k).

Any help?

2 answers

4

First, R is a vectorized programming language, which means the nested for loop is not necessary. Since each element of lista1 is a vector, you can multiply the whole numeric vector by 8 in one step.

for (i in 1:5) {
    novalista[[i]] <- lista1[[i]]*8
}

This code is already simpler and faster.
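
To see why the inner loop is redundant, here is a minimal sketch comparing an element-by-element loop with a single vectorized multiplication. The names x, by_loop and by_vector are made up for illustration and are not part of the question.

```r
x <- c(1, 2, 3, 4, 5)

# Element by element, as in the inner loop over k
by_loop <- numeric(length(x))
for (k in seq_along(x)) {
  by_loop[k] <- x[k] * 8
}

# Vectorized: the scalar 8 is recycled across every element of x
by_vector <- x * 8

identical(by_loop, by_vector)
#[1] TRUE
```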

In the case of the question's example, one can do even better.
The task is to apply a function to each element of the list lista1, and that is exactly what lapply does. There is no need to index the list elements; just apply the function directly to each of them. Keep in mind that lapply does not eliminate the loop: lapply is itself a form of loop.

novalista2 <- lapply(lista1, '*', 8)

identical(novalista, novalista2)
#[1] TRUE

Here the function is multiplication, *, with the extra argument 8, the multiplier.
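
If passing the operator as a string looks cryptic, the same call can be written with an anonymous function. The following is a sketch of equivalent forms; lista1 is redefined so the snippet runs on its own, and novalista3 is a name introduced only for this illustration.

```r
lista1 <- list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10), c(11, 12, 13, 14, 15),
               c(16, 17, 18, 19, 20), c(21, 22, 23, 24, 25))

# Operator passed by name; 8 is forwarded as its second argument
novalista2 <- lapply(lista1, '*', 8)

# The same thing with an explicit anonymous function.
# x is one whole element of lista1 (a vector), not a single number.
novalista3 <- lapply(lista1, function(x) x * 8)

identical(novalista2, novalista3)
#[1] TRUE

# An operator is an ordinary function and can be called like one
`*`(lista1[[1]], 8)
#[1]  8 16 24 32 40
```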

This almost always gives simpler code but, contrary to what many R users think, it is not always faster. To test this I will use the microbenchmark package. The two ways of creating novalista are written as functions and then timed.

f <- function(lst){
    novalista <- vector(mode = "list", length=length(lst))
    for (i in seq_along(lst)) {
        novalista[[i]] <- lst[[i]]*8
    }
    novalista
}

g <- function(lst) lapply(lst, '*', 8)

First with the short list of the question.

microbenchmark::microbenchmark(f(lista1), g(lista1), times = 1e4)

Now with a big list.

lista2 <- lapply(1:1000, function(i) sample(1000, 100))
microbenchmark::microbenchmark(f(lista2), g(lista2))

In both cases the for loop was faster.
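
If microbenchmark is not installed, a rough check is possible with base R's system.time. This is only a sketch: the helper time_it and its repetition count are assumptions of mine, not part of the benchmark above, and the timings depend on the machine and R version, so no output is shown.

```r
f <- function(lst) {
  novalista <- vector(mode = "list", length = length(lst))
  for (i in seq_along(lst)) {
    novalista[[i]] <- lst[[i]] * 8
  }
  novalista
}

g <- function(lst) lapply(lst, '*', 8)

# Hypothetical helper: elapsed seconds for n repeated calls of fun(lst)
time_it <- function(fun, lst, n = 200) {
  system.time(for (r in seq_len(n)) fun(lst))[["elapsed"]]
}

set.seed(1)
lista2 <- lapply(1:1000, function(i) sample(1000, 100))

# Both functions must return the same result before their timings are compared
stopifnot(identical(f(lista2), g(lista2)))

c(for_loop = time_it(f, lista2), lapply = time_it(g, lista2))
```

A single system.time measurement is much noisier than microbenchmark's repeated sampling, so small differences between the two numbers should not be over-interpreted.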

  • Thank you very much! My question was how lapply "would understand" the priority that the inner loop has, since the code only moves on to the next i after it has finished iterating over k. How does this step work in lapply?
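
One way to picture this, as a sketch rather than a description of lapply's internals: lapply only iterates over the outer index, handing each whole vector lista1[[i]] to the function, and the vectorized multiplication plays the role of the loop over k. If a per-element step is really needed, a second apply-style call can be nested inside the first. The name nested is made up for this illustration.

```r
lista1 <- list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10), c(11, 12, 13, 14, 15),
               c(16, 17, 18, 19, 20), c(21, 22, 23, 24, 25))

# Outer lapply walks over i; inner vapply walks over k within each vector.
# The inner call finishes completely before lapply moves to the next element,
# which mirrors the order of the nested for loops.
nested <- lapply(lista1, function(vec) {
  vapply(vec, function(elem) elem * 8, numeric(1))
})

# Same result as letting vectorized multiplication handle the inner step
identical(nested, lapply(lista1, '*', 8))
#[1] TRUE
```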

1

Run:

lista1 <- list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10), c(11, 12, 13, 14, 15),
               c(16, 17, 18, 19, 20), c(21, 22, 23, 24, 25))

novalista<-lapply(lista1,'*',8)
novalista

#[[1]]
#[1]  8 16 24 32 40

#[[2]]
#[1] 48 56 64 72 80

#[[3]]
#[1]  88  96 104 112 120

#[[4]]
#[1] 128 136 144 152 160

#[[5]]
#[1] 168 176 184 192 200
