The reason these functions still exist today is historical.
dplyr was designed with interactive use in mind, so it offers some conveniences to people who work that way, such as:
- not having to put variable names in quotes
- not having to repeat the data.frame name all the time.
Examples:
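library(dplyr)
# With dplyr: bare column names, no repeated data.frame name
mtcars <- mtcars %>%
  mutate(cyl = cyl * 2)
# Base R equivalent of the same operation
mtcars$cyl <- mtcars$cyl * 2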
This makes users more productive while they are working interactively. What makes it possible is a technique called non-standard evaluation (NSE). NSE makes interactive programming more pleasant, but it makes the code more complicated when you want to write more general functions.
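As a rough illustration of what NSE means, here is a minimal base R sketch. It is not how dplyr is implemented internally, and show_expr is a hypothetical helper invented for this example; it only shows the general idea of capturing the code that was typed and evaluating it against a data.frame.

```r
# Hypothetical helper, for illustration only: it captures the expression
# the caller typed instead of evaluating it.
show_expr <- function(x) {
  deparse(substitute(x))
}

# cyl is never looked up here; only the code itself is captured.
typed <- show_expr(cyl * 2)

# A captured expression can then be evaluated with the columns of a
# data.frame acting as variables, which is roughly what lets you write
# cyl instead of mtcars$cyl.
doubled <- eval(quote(cyl * 2), mtcars)
```

Because the function works on the code rather than on values, passing a column name stored in a variable no longer behaves the way it would with an ordinary function.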
For example, it is not obvious in dplyr how to make the name of a new variable come from the value of another variable:
> variavel <- "cyl"
>
> mtcars %>%
+ mutate(variavel = variavel *2)
Error in variavel * 2 : non-numeric argument to binary operator
dplyr does not use the value of variavel, which is cyl. Instead, it tries to create a new variable literally called variavel whose value would be "cyl" * 2, and that raises the error.
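A small sketch of the two halves of that behaviour, assuming a reasonably recent dplyr; the column name copy is just an illustrative choice:

```r
library(dplyr)

variavel <- "cyl"

# The left-hand side is taken literally: the new column is named "variavel",
# not "cyl".
res <- mtcars %>% mutate(variavel = 1)
"variavel" %in% names(res)

# On the right-hand side, variavel is not a column of mtcars, so it is looked
# up in the calling environment and yields the string "cyl", not the column.
res2 <- mtcars %>% mutate(copy = variavel)
unique(res2$copy)
```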
Over time, the authors of dplyr proposed several ways to solve this problem. One of them was to provide equivalent functions with a _ at the end of the name.
These functions used standard evaluation and were therefore useful when writing your own functions, which is what is known as programming with dplyr.
See how awkward the syntax was with mutate_, for example:
variavel <- "cyl"
mtcars %>%
mutate_(
.dots =
list(lazyeval::interp(~ 2*(var), var = as.name(variavel))) %>% setNames(variavel)
)
However, dplyr now uses a concept called tidy evaluation, and this is the recommended way to program with dplyr. Example:
variavel <- "cyl"
mtcars %>%
mutate(!!sym(variavel) := 2*!!sym(variavel))
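The same idea can be wrapped in a reusable function. The following is a minimal sketch, assuming a recent dplyr that provides the .data pronoun; double_col is a hypothetical helper written for this example, not part of dplyr.

```r
library(dplyr)

# Hypothetical helper: doubles the column whose name is given as a string.
double_col <- function(data, col) {
  data %>%
    # !!col on the left unquotes the string to use it as the column name;
    # .data[[col]] on the right looks the column up by that name.
    mutate(!!col := 2 * .data[[col]])
}

res <- double_col(mtcars, "cyl")
```

Using .data[[col]] instead of !!sym(col) is a matter of taste here; both refer to the column named by the string.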
In short, to answer your questions:
- They were useful for programming with dplyr.
- They should no longer be used. The recommended way is tidy evaluation.
You can find here the version of the vignette that introduced this concept.
The dplyr documentation on these functions now says:
dplyr used to offer twin versions of each verb suffixed with an
underscore. These versions had standard evaluation (SE) semantics:
rather than taking arguments by code, like NSE verbs, they took
arguments by value. Their purpose was to make it possible to program
with dplyr. However, dplyr now uses tidy evaluation semantics. NSE
verbs still capture their arguments, but you can now unquote parts of
these arguments. This offers full programmability with NSE verbs.
Thus, the underscored versions are now superfluous.