When should I use the do.call function?

Unlike lapply, do.call applies a function to an entire list at once (and a data.frame is also a list). Consider the for loop below:

set.seed(123)

for (i in 1:6) {
  assign(paste('var', i, sep = '_'), 
         runif(30, 20, 100))
}

I can use do.call to turn these vectors into a data.frame:

data_1 <- do.call(
  cbind.data.frame, 
  mget(ls(pattern = '^var_'))
)

But this seems pointless, because the function itself (cbind.data.frame, sum, etc.) gives the same result without do.call. For example:

data_2 <- cbind.data.frame(mget(ls(pattern = '^var_')))

Now for the sum:

do.call(sum, data_2)
[1] 10826.89

sum(data_2)
[1] 10826.89

My question:

  • In what context does do.call become indispensable? Why?

Comments:

  • I haven’t had time to write a full answer, but the key idea is to make several calls in a row.

  • A classic example is do.call with rbind on a list of results. I remember some comparisons showing that do.call was faster.

1 answer

do.call should be used when you want to pass the elements of a list as separate arguments to a function, as opposed to passing the list itself as a single argument.

That is, calling do.call(rbind, lista) is the same as calling rbind(lista[[1]], lista[[2]], ..., lista[[n]]), which in turn is different from calling rbind(lista) (see the example at the end).
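As a minimal sketch of this equivalence, the snippet below uses only base R. The objects `args` and `pieces` are hypothetical and are not part of the question's data. It shows that named list elements are matched to named parameters, and that do.call(rbind, ...) should give the same result as writing out every element by hand.

```r
# Arguments collected in a list; the `sep` element is named, so it is
# matched to paste()'s `sep` parameter.
args <- list("a", "b", sep = "-")
joined <- do.call(paste, args)   # same call as paste("a", "b", sep = "-")

# Hypothetical list of small data frames, e.g. results gathered in a loop.
pieces <- list(
  data.frame(id = 1, value = 10),
  data.frame(id = 2, value = 20),
  data.frame(id = 3, value = 30)
)

# Spread the list elements as separate arguments to rbind() ...
stacked <- do.call(rbind, pieces)

# ... which is what you would otherwise have to type out element by element.
by_hand <- rbind(pieces[[1]], pieces[[2]], pieces[[3]])

# Compare the two results.
same <- isTRUE(all.equal(stacked, by_hand))
```

Writing the call out by hand only works when the number of elements is known in advance. When the list is built at run time, do.call is the straightforward way to make the same call.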

In addition, do.call can perform better than the alternatives.

library(tidyverse)

n <- 10 ^ 4 # 10 thousand

gerar_range <- function(...) {
  valores <- range(rnorm(100))
  tibble(min = valores[1], max = valores[2])
}

lista <- map(seq_len(n), gerar_range)

microbenchmark::microbenchmark(
  do.call = do.call(rbind, lista),
  rbind = {
    res <- lista[[1]]
    for (i in lista[-1]) {
      res <- rbind(res, i)
    }
    res
  }
)
#> Unit: milliseconds
#>     expr       min        lq     mean    median        uq       max neval cld
#>  do.call  330.6749  354.1027  368.621  365.3997  381.5701  428.9259   100  a
#>    rbind 2611.6736 2727.6387 2828.675 2806.4544 2890.7625 3159.1669   100   b

Created on 2019-03-11 by the reprex package (v0.2.1)

Note also that, because of the difference described at the beginning, do.call(rbind, lista) and rbind(lista) can give different results, so the two cannot (and should not) be compared. Example:

rbind(lista[1:10])
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10] 
[1,] List,2 List,2 List,2 List,2 List,2 List,2 List,2 List,2 List,2 List,2

do.call(rbind, lista[1:10])
# A tibble: 10 x 2
     min   max
   <dbl> <dbl>
 1 -2.67  2.24
 2 -2.36  2.24
 3 -2.84  2.82
 4 -2.30  2.05
 5 -3.01  2.73
 6 -2.07  2.75
 7 -2.21  2.76
 8 -3.13  2.14
 9 -2.32  1.92
10 -3.45  2.70
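The same difference can be checked programmatically instead of by reading the printed output. This is a small base R sketch that uses a hypothetical two-element list `pieces` in place of `lista`:

```r
# Hypothetical stand-in for `lista`: a list of one-row data frames.
pieces <- list(
  data.frame(min = -1, max = 1),
  data.frame(min = -2, max = 2)
)

# The whole list is passed as ONE argument, so rbind() treats it as a
# single row whose cells are the list elements.
as_one_arg <- rbind(pieces)

# Each list element is passed as its own argument, so the rows are stacked.
spread <- do.call(rbind, pieces)

is.matrix(as_one_arg)   # a list-matrix, not a data frame
dim(as_one_arg)         # one row, one column per list element
is.data.frame(spread)   # a regular data frame
dim(spread)             # one row per list element
```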
