apply, sapply, mapply, lapply, vapply, rapply, tapply, replicate, aggregate, by and related functions in R: when and how to use them?

Viewed 4,769 times

16

What is the difference between the functions apply, sapply, mapply, lapply, vapply, rapply, tapply, replicate, aggregate, by and related functions in R?

When and how to use each of them?

Are there other packages that do something similar or can replace these functions?

  • See if this post helps: http://stackoverflow.com/a/7141669/3096200 Regards, Luiz

2 answers

14


Translating from here.

R has many *apply functions that are well explained in the help pages (e.g. ?apply). Because there are so many, new users may have trouble deciding which one fits their situation, or even remembering them all.

  • apply - When you want to apply a function to the rows or columns of a matrix (or to the margins of a higher-dimensional array).

    # Two-dimensional matrix
    M <- matrix(seq(1,16), 4, 4)
    
    # apply min to the rows
    apply(M, 1, min)
    [1] 1 2 3 4
    
    # apply max to the columns
    apply(M, 2, max)
    [1]  4  8 12 16
    
    # Three-dimensional array
    M <- array( seq(32), dim = c(4,4,2))
    
    # Apply sum to each M[*, , ], i.e. sum across the 2nd and 3rd dimensions
    apply(M, 1, sum)
    # The result is one-dimensional
    [1] 120 128 136 144
    
    # Apply sum to each M[*, *, ], i.e. sum across the 3rd dimension
    apply(M, c(1,2), sum)
    # The result is two-dimensional
         [,1] [,2] [,3] [,4]
    [1,]   18   26   34   42
    [2,]   20   28   36   44
    [3,]   22   30   38   46
    [4,]   24   32   40   48
    
  • lapply - When you want to apply a function to each element of a list and get a list back.

    This is the workhorse behind many of the other *apply functions.

       x <- list(a = 1, b = 1:3, c = 10:100) 
       lapply(x, FUN = length) 
       $a 
       [1] 1
       $b 
       [1] 3
       $c 
       [1] 91
    
       lapply(x, FUN = sum) 
       $a 
       [1] 1
       $b 
       [1] 6
       $c 
       [1] 5005
    
  • sapply - When you want to apply a function to each element of a list, but want a vector back instead of a list.

    If you find yourself writing unlist(lapply(...)), consider using sapply instead.

       x <- list(a = 1, b = 1:3, c = 10:100)
       # Compare with the above: a named vector, not a list
       sapply(x, FUN = length)  
       a  b  c   
       1  3 91
    
       sapply(x, FUN = sum)   
       a    b    c    
       1    6 5005 
    

    In more advanced uses, sapply will try to coerce the result to a multi-dimensional array where appropriate. For example, if our function returns vectors of the same length, sapply will use them as the columns of a matrix:

       sapply(1:5,function(x) rnorm(3,x))
    

    If our function returns a 2-dimensional matrix, sapply will do essentially the same thing, treating each matrix as a single vector:

       sapply(1:5,function(x) matrix(x,2,2))
    

    Unless we specify simplify = "array", in which case it will use the individual matrices to build a multi-dimensional array:

       sapply(1:5,function(x) matrix(x,2,2), simplify = "array")
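
To make the three cases above concrete, the following sketch inspects the shape of each result. It swaps the random rnorm call for a deterministic function so the output is reproducible; the variable names (cols, flat, arr) are illustrative only.

```r
# Each call returns a length-3 vector, so sapply binds them as columns: 3 x 5
cols <- sapply(1:5, function(x) x * 1:3)
dim(cols)

# Each 2 x 2 matrix is flattened into a length-4 column: 4 x 5
flat <- sapply(1:5, function(x) matrix(x, 2, 2))
dim(flat)

# simplify = "array" keeps the matrices intact: 2 x 2 x 5
arr <- sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")
dim(arr)
```

Checking dim() like this is a quick way to see which simplification sapply chose before relying on it in later code.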
    
  • vapply - When you want to use sapply but need to squeeze a bit more speed out of your code.

    With vapply, you basically give R a template of what kind of value your function will return, which can improve performance.

    x <- list(a = 1, b = 1:3, c = 10:100)
    # Note that since the advantage here is mainly speed, this
    # example is only for illustration. We are telling R that
    # everything returned by length() should be an integer of
    # length 1.
    vapply(x, FUN = length, FUN.VALUE = 0L)
    a  b  c  
    1  3 91
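
Besides speed, vapply also checks every result against FUN.VALUE, which sapply does not do. A minimal sketch of that behaviour, assuming the same list x as above; the names lens and res and the error message string are made up for this illustration.

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# Every result must be a single integer, matching the template integer(1)
lens <- vapply(x, FUN = length, FUN.VALUE = integer(1))

# range() returns two values, so a length-1 template is rejected with an error
res <- tryCatch(
  vapply(x, FUN = range, FUN.VALUE = numeric(1)),
  error = function(e) "vapply rejected the result"
)

# sapply would silently return a 2 x 3 matrix in the same situation
dim(sapply(x, FUN = range))
```

This is why vapply is often preferred inside functions: a surprising return shape fails early instead of propagating.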
    
  • mapply - When you have several data structures (e.g. vectors, lists) and you want to apply a function to the first elements of each, then the second elements, and so on, coercing the result to a vector or array as in sapply.

    In this case your function must accept multiple arguments.

    # Sum the 1st elements, the 2nd elements, etc.
    mapply(sum, 1:5, 1:5, 1:5) 
    [1]  3  6  9 12 15
    # To do rep(1,4), rep(2,3), etc.
    mapply(rep, 1:4, 4:1)   
    [[1]]
    [1] 1 1 1 1
    
    [[2]]
    [1] 2 2 2
    
    [[3]]
    [1] 3 3
    
    [[4]]
    [1] 4
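
Two optional arguments of mapply are worth knowing: MoreArgs passes arguments that stay fixed across all calls, and SIMPLIFY = FALSE always returns a list. A short sketch with made-up inputs (the names labels and sums are illustrative):

```r
# Paste pairs element-wise; sep is held constant through MoreArgs
labels <- mapply(paste, c("a", "b", "c"), 1:3, MoreArgs = list(sep = "_"))
labels

# SIMPLIFY = FALSE keeps the result as a list even when it could be a vector
sums <- mapply(function(x, y) x + y, 1:3, 4:6, SIMPLIFY = FALSE)
sums
```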
    
  • rapply - When you want to apply a function to each element of a nested list, recursively.

    # Append "!" to a string, or increment a number
    myFun <- function(x){
        if (is.character(x)){
            return(paste(x,"!",sep=""))
        }
        else{
            return(x + 1)
        }
    }
    
    # Example nested list
    l <- list(a = list(a1 = "Boo", b1 = 2, c1 = "Eeek"), 
              b = 3, c = "Yikes", 
              d = list(a2 = 1, b2 = list(a3 = "Hey", b3 = 5)))
    
    
    # The result is a named vector, coerced to character
    rapply(l,myFun)
    
    # The result is a list like l, but with the values changed
    rapply(l, myFun, how = "replace")
    
  • tapply - When you want to apply a function to subsets of a vector, where the subsets are defined by another vector (usually a factor).

    A vector:

       x <- 1:20
    

    A factor (of the same length!) defining the groups:

       y <- factor(rep(letters[1:5], each = 4))
    

    Sum the values of x within each subgroup defined by y:

       tapply(x, y, sum)  
        a  b  c  d  e  
       10 26 42 58 74 
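
One way to picture what tapply does is as split() followed by sapply(). The sketch below reproduces the group sums that way; it is meant as an illustration of the idea, not a description of how tapply is implemented, and the name by_hand is made up.

```r
x <- 1:20
y <- factor(rep(letters[1:5], each = 4))

# split() cuts x into one piece per level of y; sapply() then sums each piece
by_hand <- sapply(split(x, y), sum)
by_hand

# The group sums agree with those from tapply
all.equal(as.vector(tapply(x, y, sum)), as.vector(by_hand))
```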
    
  • aggregate and by - It is relatively easy to aggregate data in R using one or more grouping (by) variables and a function of your choice.

    # Mean of every column of mtcars, grouped by cyl and vs
    attach(mtcars)
    aggdata <- aggregate(mtcars, by = list(cyl, vs),
                         FUN = mean, na.rm = TRUE)
    print(aggdata)
    detach(mtcars)
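
The bullet above names by but only shows aggregate. As a hedged sketch of the difference: aggregate returns a data frame with one row per group, while by passes each group's rows to the function as a data frame and returns a list-like object of class "by". The choice of the mpg and hp columns, and the names res and agg, are only for illustration.

```r
# by() hands each subset of rows to the function as a data frame
res <- by(mtcars[, c("mpg", "hp")], mtcars$cyl, function(d) colMeans(d))
res

# Formula interface of aggregate(): mean mpg for each value of cyl
agg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
agg
```

Because by works on whole data frames, it suits functions that need several columns at once (for example fitting a model per group), whereas aggregate fits simple column-wise summaries.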

  • 2

    Filipe, I will give preference to an original answer (if it is just as good and complete, of course), but if none appears, you will get it!

  • Very good! Congratulations, Filipe!

3

I think the best way to discover anything in R is to learn by experimentation, using embarrassingly trivial data and functions.

If you fire up your R console, type ??apply and scroll down to the functions in the base package, you'll see something like this:

1: base::apply             Apply Functions Over Array Margins
2: base::by                Apply a Function to a Data Frame Split by Factors
3: base::eapply            Apply a Function Over Values in an Environment
4: base::lapply            Apply a Function over a List or Vector
5: base::mapply            Apply a Function to Multiple List or Vector Arguments
6: base::rapply            Recursively Apply a Function to a List
7: base::tapply            Apply a Function Over a Ragged Array

An example using eapply:

    # a new environment
    e <- new.env()
    # two environment variables, a and b
    e$a <- 1:10
    e$b <- 11:20
    # mean of the variables
    eapply(e, mean)
    $b
    [1] 15.5
    $a
    [1] 5.5

Source
