Reproducing the data
dados <- structure(
  list(
    X = c("Ver_suj", "Ver_obj", "Substantivo", "Adjetivo"),
    Bolsonaro = c(59L, 299L, 988L, 653L),
    Ciro = c(188L, 242L, 128L, 212L),
    Manuela = c(59L, 66L, 1024L, 629L),
    Marina = c(87L, 135L, 741L, 28L)
  ),
  class = "data.frame", row.names = c(NA, -4L)
)
Recent versions of dplyr accept formula notation inside mutate_at() and the other scoped variants of mutate(). So we have:
library(dplyr)

# Divide every column except the first by its own column total
dados %>%
  mutate_at(-1, ~.x/sum(.x))
# X Bolsonaro Ciro Manuela Marina
# 1 Ver_suj 0.02951476 0.2441558 0.03318335 0.08779011
# 2 Ver_obj 0.14957479 0.3142857 0.03712036 0.13622603
# 3 Substantivo 0.49424712 0.1662338 0.57592801 0.74772957
# 4 Adjetivo 0.32666333 0.2753247 0.35376828 0.02825429
In words, this call says: "Modify every column except the first, dividing each value by the sum of the values in its column". The first part is determined by the -1 passed to mutate_at(), and the second by the formula ~.x/sum(.x). In this formula, .x is a generic placeholder for each selected column, passed in as a vector.
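To make the role of .x concrete, here is a minimal base R sketch of the same operation, with the formula written out as an ordinary anonymous function. The name proporcoes is only illustrative and is not part of the original answer; the data is rebuilt so the block runs on its own.

```r
# Same data as above, rebuilt so this block is self-contained
dados <- data.frame(
  X = c("Ver_suj", "Ver_obj", "Substantivo", "Adjetivo"),
  Bolsonaro = c(59L, 299L, 988L, 653L),
  Ciro = c(188L, 242L, 128L, 212L),
  Manuela = c(59L, 66L, 1024L, 629L),
  Marina = c(87L, 135L, 741L, 28L)
)

# `proporcoes` is a hypothetical name used only for this illustration
proporcoes <- dados

# dados[-1] drops the first column; the anonymous function receives each
# remaining column as a vector x, which is the role .x plays in the formula
proporcoes[-1] <- lapply(dados[-1], function(x) x / sum(x))

# Sanity check: every converted column should sum to 1,
# up to floating-point error
colSums(proporcoes[-1])
```

The -1 here does the same job as in mutate_at(-1, ...): it selects everything except the first column.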
Alternative
In the more traditional style of dplyr, the usual approach would be to define a function that returns the proportions and pass it to mutate_at() or mutate_if(). Something like this:
percentual <- function(n) {
  n / sum(n)
}
dados %>%
  mutate_if(is.integer, percentual)
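Since dplyr 1.0.0, the scoped verbs mutate_at() and mutate_if() are superseded by across(). As a sketch, assuming dplyr 1.0.0 or later is installed, both approaches above could be written as follows. The names res_formula and res_funcao are only illustrative.

```r
library(dplyr)

# Same data as above, rebuilt so this block is self-contained
dados <- data.frame(
  X = c("Ver_suj", "Ver_obj", "Substantivo", "Adjetivo"),
  Bolsonaro = c(59L, 299L, 988L, 653L),
  Ciro = c(188L, 242L, 128L, 212L),
  Manuela = c(59L, 66L, 1024L, 629L),
  Marina = c(87L, 135L, 741L, 28L)
)

percentual <- function(n) {
  n / sum(n)
}

# Formula version: every column except X
res_formula <- dados %>%
  mutate(across(-X, ~ .x / sum(.x)))

# Named-function version: every integer column
res_funcao <- dados %>%
  mutate(across(where(is.integer), percentual))

# Both selections cover the same four columns, so the results should agree
all.equal(res_formula, res_funcao)
```

Selecting with where(is.integer) is the safer choice if the position of the label column might change, while -X makes the excluded column explicit.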
How about sharing the data with dput(dados)? See more here on how to improve the question. – Tomás Barcellos
Thank you, I'll do that. – user135517