With base R, you can use aggregate. To apply more than one function, you can merge the results of separate aggregations:
agg <- merge(
aggregate(df1[3], by = df1[1:2], length),
aggregate(df1[3], by = df1[1:2], sum, na.rm = TRUE),
by = c("origem", "destino"))
names(agg)[3:4] <- c("num_total", "valor_total")
agg
#> origem destino num_total valor_total
#> 1 A A 1 1
#> 2 A B 2 2
#> 3 B A 2 3
#> 4 B B 1 -1
#> 5 B C 1 5
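The question's df1 is not shown here, so to try this yourself, here is a minimal sketch with made-up data. The values are an assumption, chosen only so that the totals match the output above; one NA is included to show that length() still counts that row while sum(..., na.rm = TRUE) skips it.

```r
# Hypothetical data (not the original df1), picked to reproduce the totals above
df1 <- data.frame(
  origem  = c("A", "A", "A", "B", "B", "B", "B"),
  destino = c("A", "B", "B", "A", "A", "B", "C"),
  valor   = c(1, 2, NA, 1, 2, -1, 5)
)

# One aggregate() call per summary function, joined on the grouping columns
agg <- merge(
  aggregate(df1[3], by = df1[1:2], length),            # rows per group (NA included)
  aggregate(df1[3], by = df1[1:2], sum, na.rm = TRUE), # sum per group (NA ignored)
  by = c("origem", "destino")
)
names(agg)[3:4] <- c("num_total", "valor_total")
agg
```

Keep in mind that num_total here counts rows, not non-missing values, so the A/B group reports 2 even though only one value entered the sum.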
Or combine the results of several functions with c():
agg <- aggregate(df1$valor, by = df1[1:2], function(x) c(num_total = length(x), valor_total = sum(x, na.rm = TRUE)))
As pointed out by @Rui-Barradas in the comments, concatenating this way produces a variable x that contains a matrix with the results. To get a data frame with only vector columns:
agg <- cbind(agg[-length(agg)], agg[[length(agg)]])
agg
#> origem destino num_total valor_total
#> 1 A A 1 1
#> 2 B A 2 3
#> 3 A B 2 2
#> 4 B B 1 -1
#> 5 B C 1 5
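If you want to see the matrix column for yourself, the sketch below checks the structure before and after the cbind step. The data is again made up (an assumption, not the original df1), and flat is just an illustrative name.

```r
# Hypothetical data (not the original df1)
df1 <- data.frame(
  origem  = c("A", "A", "A", "B", "B", "B", "B"),
  destino = c("A", "B", "B", "A", "A", "B", "C"),
  valor   = c(1, 2, NA, 1, 2, -1, 5)
)

agg <- aggregate(
  df1$valor, by = df1[1:2],
  function(x) c(num_total = length(x), valor_total = sum(x, na.rm = TRUE))
)

ncol(agg)         # three columns: origem, destino and x
is.matrix(agg$x)  # x is a two-column matrix, not a vector

# Split the matrix column into ordinary vector columns
flat <- cbind(agg[-length(agg)], agg[[length(agg)]])
names(flat)
sapply(flat, is.matrix)
```

The column names num_total and valor_total come from the names given inside c(), which is why no renaming is needed afterwards.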
Packages like dplyr (used in the answer by Marcus Nunes) make grouped operations easier. Another option is data.table:
library(data.table)
setDT(df1)
df1[, .(num_total = .N, valor_total = sum(valor, na.rm = TRUE)), .(origem, destino)]
#> origem destino num_total valor_total
#> 1: A A 1 1
#> 2: A B 2 2
#> 3: B A 2 3
#> 4: B C 1 5
#> 5: B B 1 -1
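For comparison, a dplyr version of the same summary could look roughly like the sketch below. This is not necessarily the code from the other answer; the data is made up (an assumption, not the original df1) and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical data (not the original df1)
df1 <- data.frame(
  origem  = c("A", "A", "A", "B", "B", "B", "B"),
  destino = c("A", "B", "B", "A", "A", "B", "C"),
  valor   = c(1, 2, NA, 1, 2, -1, 5)
)

res <- df1 %>%
  group_by(origem, destino) %>%
  summarise(
    num_total   = n(),                      # rows per group
    valor_total = sum(valor, na.rm = TRUE), # sum per group, ignoring NA
    .groups = "drop"                        # return an ungrouped result
  )
res
```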
I upvoted, but one note: if you run str(agg) you will see that the variable x is a matrix, the result of an rbind of the vectors produced by the function c(). An option to get a single df with vector columns is agg <- aggregate(.) followed by cbind(agg[-length(agg)], agg[[length(agg)]]). Here it is the cbind.data.frame method that gets called, since agg[-length(agg)] is a df. – Rui Barradas
Thanks, I had not noticed this detail. I will edit the answer ASAP.
– Carlos Eduardo Lagosta