How to apply several functions to the same object?

7

How can I apply several functions to the same object in R?

Example:

Let’s say I have a vector x.

set.seed(123)
x <- rnorm(10)
x
# [1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774
# [6]  1.71506499  0.46091621 -1.26506123 -0.68685285 -0.44566197

I need to apply a series of functions to this vector. Let’s say, just as an example, that these functions are the mean, standard deviation, minimum and maximum.

The most naive way to do this is to call each function separately.

mean(x)
# [1] 0.07462564
sd(x)
# [1] 0.9537841
max(x)
# [1] 1.715065
min(x)
# [1] -1.265061

How could I replace these four calls with a single, more flexible one?

Edit

The idea is that the functions to apply can be chosen at call time, for example dropping min() or adding median() as needed.

A map applies one function to several values. The question here is how to do the reverse: apply several functions to one value.

4 answers

7


The base R function Map can do what you want.
First I will recreate the data, since I will also use a list of vectors, not just a list of functions.

set.seed(123)
x <- rnorm(10)
y <- x
is.na(y) <- sample(10, 3)

Now Map will apply the various functions to the vectors x and y, one vector at a time.

Map(function(f, x, ...){f(x, ...)}, list(mean, sd, median), list(x), na.rm = TRUE)
Map(function(f, x, ...){f(x, ...)}, list(mean, sd, median), list(y), na.rm = TRUE)

This can be wrapped in a more general function that also accepts a list of vectors as its second argument.

MapFuns <- function(fun.list, object, ...){
  if(is.list(object)){
    lapply(object, function(x)
      Map(function(f, x, ...){f(x, ...)}, fun.list, list(x), ...)
    )
  }else{
    Map(function(f, x, ...){f(x, ...)}, fun.list, list(object), ...)
  }
}

flist <- list(mean, sd, median)
MapFuns(flist, x)
MapFuns(flist, list(x, y))
MapFuns(flist, list(x, y), na.rm = TRUE)
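
If a compact table is more convenient than nested lists, the same idea can be sketched with two nested sapply() calls. This is only an illustration: `stats_table` is a hypothetical helper that is not part of the answer above, and it assumes a named list of functions that each return a single number, plus a named list of numeric vectors.

```r
# Hypothetical helper (not from the answer above):
# one row per function, one column per vector.
stats_table <- function(fun.list, vec.list, ...) {
  sapply(vec.list, function(v) {
    # `...` is forwarded to every function, e.g. na.rm = TRUE
    sapply(fun.list, function(f) f(v, ...))
  })
}

set.seed(123)
x <- rnorm(10)
y <- x
is.na(y) <- sample(10, 3)

flist <- list(mean = mean, sd = sd, median = median)
tab <- stats_table(flist, list(x = x, y = y), na.rm = TRUE)
tab
```

Here `tab` should be a numeric matrix with the function names as row names and the vector names as column names. Functions that return more than one value would need a different simplification.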

And it also works with more complex objects.

df1 <- data.frame(A = rep(c("a", "b"), 5), X = rnorm(10))
df2 <- data.frame(A = rep(c("a", "b"), 5), X = rnorm(10))

groupMean <- function(DF){
  tapply(DF[[2]], DF[[1]], mean, na.rm = TRUE)
}

groupMean2 <- function(DF){
  aggregate(DF[[2]], by = list(DF[[1]]), mean, na.rm = TRUE)
}

MapFuns(list(groupMean, groupMean2), list(df1, df2))

5

With the purrr package, you can invoke several functions at once with invoke_map() and its variants.

invoke_map() has three basic components:

  1. .f: A list of the functions (the function objects themselves, not their names as strings) that will be invoked.
  2. .x: A list of argument lists, one for each function. If .x is a list of length one, it is recycled and every function is called with the same arguments.
  3. ...: Other arguments that are passed on to every function.

For example:

set.seed(123)
x <- rnorm(10)
funcoes <- list(media = mean, maximo = max)

library(purrr)

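invoke_map(funcoes, list(list(x)))
# $`media`
# [1] 0.07462564
#
# $maximo
# [1] 1.715065

Note that x is wrapped twice: the outer list is .x and the inner list holds the arguments for each call. With list(x) alone, the ten elements of x would be passed as ten separate arguments, and mean() would only see the first one.

In this sense, invoke_map() is the opposite of map().

The invoke_map_*() functions have the same variants as map_*(), which coerce the output to a given type. Thus it is possible to convert the result into a numeric vector or into a table, for example, as follows.

invoke_map_dbl(funcoes, list(list(x)))
#      media     maximo 
# 0.07462564 1.71506499 

invoke_map_df(funcoes, list(list(x)))
# # A tibble: 1 x 2
#    media maximo
#    <dbl>  <dbl>
# 1 0.0746   1.72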
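
As far as I know, invoke_map() and its variants are deprecated in recent purrr releases (1.0.0 and later). A minimal sketch of the same idea with plain map(), iterating over the functions rather than over the data, could look like this. It assumes purrr is installed, and `res_list` and `res_dbl` are just illustrative names.

```r
library(purrr)

set.seed(123)
x <- rnorm(10)
funcoes <- list(media = mean, maximo = max)

# Iterate over the list of functions and call each one on x.
res_list <- map(funcoes, function(f) f(x))

# Same idea, simplified to a named numeric vector.
res_dbl <- map_dbl(funcoes, function(f) f(x))
res_dbl
```

Extra arguments such as na.rm = TRUE can be added inside the anonymous function, for example `function(f) f(x, na.rm = TRUE)`.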

Using the vector proposed by @Rui, it is possible to demonstrate two ways of passing arguments to invoke_map(). In the first, the argument na.rm is passed via .... In the second, it is supplied inside the argument list given in .x.

y <- x
is.na(y) <- sample(10, 3)

invoke_map(funcoes, list(list(y)), na.rm = TRUE)
# $`media`
# [1] -0.1061246
# 
# $maximo
# [1] 1.558708

invoke_map(funcoes, list(list(x = y, na.rm = TRUE)))
# $`media`
# [1] -0.1061246
# 
# $maximo
# [1] 1.558708

  • Why did you ask the question if you already knew the answer?

  • Because other people may have the same doubt. We are producing material on R in Portuguese for future readers.

  • I understand, but then it’s important that you explain that in your question and give your answer within the question.

  • I wouldn’t have answered if someone else had brought up purrr. I know the community has people who know the package better than I do and who could have answered the question.

  • Moreover, this answer only complements the others with another approach.

  • Still, you could have given your answer and explained in the question how it could be improved.

  • @Jdemello Recommended reading: the Meta discussion on answering your own question.

5

Just create a function that combines the results with c():

fun1<-function(x){
    c(mean=mean(x),sd=sd(x),min=min(x),max=max(x))
}

Then apply this function to the desired object:

fun1(x)
       mean          sd         min         max 
 0.07462564  0.95378405 -1.26506123  1.71506499 

Important point:

What if the object contains missing values (NA)? In that case the function returns NA for every statistic (an anonymous function would fare no better):

y<-c(2,3,NA,5,6,7,8,NA)

mean(y)
[1] NA
sd(y)
[1] NA
min(y)
[1] NA
max(y)
[1] NA

fun1(y)
mean   sd  min  max 
NA   NA   NA   NA

When this happens, pass the argument na.rm = TRUE.

mean(y,na.rm=TRUE)
[1] 5.166667

Inside the function, the adjustment could be made like this:

fun2 <- function(x){
  funs <- list(mean = mean, sd = sd, min = min, max = max)
  lapply(funs, function(f) f(x, na.rm = TRUE))
}

fun2(y)    

$`mean`
[1] 5.166667

$sd
[1] 2.316607

$min
[1] 2

$max
[1] 8

Or use sapply to return a vector instead:

fun3 <- function(x){
  funs <- list(mean = mean, sd = sd, min = min, max = max)
  sapply(funs, function(f) f(x, na.rm = TRUE))
}

fun3(y)
mean       sd      min      max 
5.166667 2.316607 2.000000 8.000000 
  • Thanks for the answer, but I will edit the question. Apparently I did not make it clear that the functions listed are just examples. The idea is that the functions are passed in as arguments.

  • What I want is a function operator.
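
One way to read "function operator" here is a function that takes functions and returns a new function. The following is only a sketch of that reading, not the asker's or the answerer's code: `make_summary` and `resumo` are hypothetical names, and it assumes every supplied function returns a single number and accepts whatever extra arguments are passed on.

```r
# Hypothetical function operator: takes named functions,
# returns a function that applies all of them to its input.
make_summary <- function(...) {
  funs <- list(...)
  function(x, ...) {
    # the inner `...` (e.g. na.rm = TRUE) is forwarded to each function
    sapply(funs, function(f) f(x, ...))
  }
}

resumo <- make_summary(mean = mean, sd = sd, min = min, max = max)

y <- c(2, 3, NA, 5, 6, 7, 8, NA)
res <- resumo(y, na.rm = TRUE)
res
```

The set of statistics is then fixed when `make_summary()` is called, so dropping min or adding median only changes that one line.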

4

We can create a function that takes an ellipsis (...) as an argument.

set.seed(123)
x <- rnorm(10)

myFun <- function(x, ...){
  funs <- list(...)
  res <- vector("list", length = length(funs))
  for(i in seq_along(funs)){
    f <- match.fun(funs[[i]])
    res[[i]] <- f(x)
    rm(f)
  }
  rm(i)
  names(res) <- unlist(funs)
  return(res)
}

An example:

> myFun(x, "mean", "median", "sd", "max", "min", "round")
$`mean`
[1] 0.07462564

$median
[1] -0.07983455

$sd
[1] 0.9537841

$max
[1] 1.715065

$min
[1] -1.265061

$round
 [1] -1  0  2  0  0  2  0 -1 -1  0

In myFun, the ... argument can hold any number of arguments. To use them, first collect them into a list:

myFun <- function(x, ...){
  funs <- list(...) # collect the arguments into a list
  return(funs)
}

See that in this partial version of myFun, the object to be returned will be a list of the arguments given in ...:

> myFun(x, "a", "b", mean)
[[1]]
[1] "a"

[[2]]
[1] "b"

[[3]]
function (x, ...) 
UseMethod("mean")
<bytecode: 0x000000000cc43678>
<environment: namespace:base>

Now the problem is to use the objects in this list to carry out the required operations. In your case, you want to operate on x using the arguments given in ..., which are themselves functions (or function names). This is done with match.fun(), whose documentation says:

Description

When called inside functions that take a function as argument, extract the desired function object while avoiding undesired matching to objects of other types.

This function is meant to be used within another function, since:

match.fun is not intended to be used at the top level since it will perform matching in the parent of the caller.
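
To make that concrete, here is a minimal sketch of match.fun() inside a wrapper. `call_on` is a hypothetical name used only for illustration. The point is that the caller may pass either a function object or the name of a function as a string.

```r
# Hypothetical wrapper: f may be a function or the name of one.
call_on <- function(f, x) {
  f <- match.fun(f)  # turns "mean" into the function mean; leaves mean unchanged
  f(x)
}

a <- call_on("mean", 1:10)
b <- call_on(mean, 1:10)
identical(a, b)
```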

na.rm = T

If you want to pass optional arguments such as na.rm = T to the functions given in ..., we can add a fallback inside the loop: try the call with na.rm first and, if it fails, repeat it without:

myFun <- function(x, ..., na.rm = FALSE){
  funs <- list(...)
  res <- vector("list", length = length(funs))
  for(i in seq_along(funs)){
    f <- match.fun(funs[[i]])
    # first try calling the function with the na.rm argument
    res[[i]] <- try(f(x, na.rm = na.rm), silent = TRUE)
    # on error, call it again without na.rm
    if(inherits(res[[i]], "try-error")) res[[i]] <- f(x)
    rm(f)
  }
  rm(i)
  names(res) <- unlist(funs)
  return(res)
}

x <- c(x, NA)

Result:

> myFun(x, "mean", "round")
$`mean`
[1] NA

$round
 [1] -1  0  2  0  0  2  0 -1 -1  0 NA

> myFun(x, "mean", "round", na.rm = T) # with na.rm = T
$`mean`
[1] 0.07462564

$round
 [1] -1  0  2  0  0  2  0 -1 -1  0 NA
