First, R is a vectorized programming language, which means the double for loop is not necessary. Since each member of lista1 is a vector, you can multiply the whole vector by 8 in a single operation.
for (i in 1:5) {
  novalista[[i]] <- lista1[[i]]*8
}
And the code is already simpler and faster.
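To see why the inner loop can go, here is a minimal sketch comparing an element-by-element loop with the vectorized form. The vector v and the names by_element and vectorized are made up for this illustration; they are not from the question.

```r
# Hypothetical vector standing in for one member of lista1
v <- c(2L, 5L, 7L)

# Element by element, as an inner loop would do it
by_element <- numeric(length(v))
for (k in seq_along(v)) {
  by_element[k] <- v[k] * 8
}

# Vectorized: the multiplication is applied to every element at once
vectorized <- v * 8

# Both approaches give the same values
identical(by_element, vectorized)
```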
For the example in the question, one can do even better. The task is to apply a function to each member of the list lista1, and that is exactly what lapply does. There is no need to index the list members; just apply the function directly to each of them. Keep in mind, though, that lapply does not eliminate the loop: lapply is itself a form of loop.
novalista2 <- lapply(lista1, '*', 8)
identical(novalista, novalista2)
#[1] TRUE
Here the function is multiplication, *, with the extra argument 8, the multiplier.
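As a rough sketch of what this call amounts to, the block below writes the same computation three ways: with the operator passed by name, with an anonymous function, and with a hand-written loop. The list lista_exemplo and the helper my_lapply are hypothetical names introduced only for this illustration. The real lapply is implemented internally by R, so my_lapply is a simplified picture of the idea, not the actual implementation.

```r
# Hypothetical list standing in for lista1
lista_exemplo <- list(1:3, 4:6)

# Operator passed by name, with 8 as the extra argument
r1 <- lapply(lista_exemplo, '*', 8)

# Same computation with an explicit anonymous function
r2 <- lapply(lista_exemplo, function(x) x * 8)

# Simplified, hypothetical stand-in for lapply: a loop that calls FUN
# on each member and forwards the extra arguments
my_lapply <- function(X, FUN, ...) {
  FUN <- match.fun(FUN)
  out <- vector(mode = "list", length = length(X))
  for (i in seq_along(X)) {
    out[[i]] <- FUN(X[[i]], ...)
  }
  out
}
r3 <- my_lapply(lista_exemplo, '*', 8)

# All three results agree
identical(r1, r2) && identical(r1, r3)
```

The anonymous-function form is often easier to read when the operation is more involved than a single operator.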
This almost always gives simpler code but, contrary to what many R users think, it is not always faster. To test this I will use the microbenchmark package. The two ways of creating novalista are written as functions and then timed.
f <- function(lst){
  novalista <- vector(mode = "list", length=length(lst))
  for (i in seq_along(lst)) {
    novalista[[i]] <- lst[[i]]*8
  }
  novalista
}
g <- function(lst) lapply(lst, '*', 8)
First with the short list of the question.
microbenchmark::microbenchmark(f(lista1), g(lista1), times = 1e4)
Now with a big list.
lista2 <- lapply(1:1000, function(i) sample(1000, 100))
microbenchmark::microbenchmark(f(lista2), g(lista2))
As can be seen, in both cases the for loop was faster.
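For anyone who wants to check this without installing a package, here is a rough base-R sketch of the same comparison. The names with_for, with_lapply and big are hypothetical; the two functions are compact copies of f and g above, and big is built the same way as lista2. Timings from system.time are coarser than those from microbenchmark and vary from machine to machine, so treat the numbers as indicative only.

```r
# Compact copies of the two approaches (same logic as f and g above)
with_for <- function(lst) {
  out <- vector(mode = "list", length = length(lst))
  for (i in seq_along(lst)) out[[i]] <- lst[[i]] * 8
  out
}
with_lapply <- function(lst) lapply(lst, '*', 8)

# Hypothetical test list, built like lista2
set.seed(1)
big <- lapply(1:1000, function(i) sample(1000, 100))

# Both approaches must agree before the timings mean anything
stopifnot(identical(with_for(big), with_lapply(big)))

# Rough elapsed times over 100 repetitions; values vary by machine
t_for    <- system.time(for (r in 1:100) with_for(big))[["elapsed"]]
t_lapply <- system.time(for (r in 1:100) with_lapply(big))[["elapsed"]]
c(for_loop = t_for, lapply = t_lapply)
```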
Thank you very much! My question was how lapply "would understand" the priority that the inner loop has, because the loop only moves on to the next i once it has finished everything that runs over k. How does this step work in lapply?
– Laura