Percentage Frequency in R with dplyr

1

I want to use the dplyr package to calculate the relative frequency by group. My data looks like the first three columns below, and I would like to compute the last column (Resultado final), which is each row's share of depositos within its Central:

CNPJ    Central                  depositos   Resultado final
315406  SICOOB CECRESP           4,61E+13    97,78%
512839  SICOOB CECRESP           1,05E+12    2,22%
68987   SICOOB CREDIMINAS        5,22E+13    33,00%
429890  SICOOB CREDIMINAS        3,88E+13    24,54%
803287  SICOOB CREDIMINAS        3,82E+13    24,15%
804046  SICOOB CREDIMINAS        2,90E+13    18,31%
694877  SICOOB PLANALTO CENTRAL  5,01E+13    100,00%
694389  SICOOB SC/RS             8,75E+13    67,28%
707903  SICOOB SC/RS             4,25E+13    32,72%

Any suggestions? I don’t know much about the dplyr package, but I made some unsuccessful attempts like:

dados <- dados %>% 
  group_by(CENTRAL, depositos) %>%
  summarise(value = sum(value)) %>%
  mutate(csum = cumsum(value))

And how would I get the cumulative relative frequency by CENTRAL?

2 answers

3


You can try this:

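dados %>%
    group_by(Central) %>%                                # group by Central only; adding depositos would leave one row per group
    mutate(freq_relat = depositos / sum(depositos)) %>%  # share of each group's total depositos
    mutate(freq_relat = round(freq_relat * 100, 2))      # as a percentage with 2 decimals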

  • Rafael, how would you compute the cumulative relative frequency?

  • Hi Veiga, I would continue the pipe and use cumsum: mutate(freq_cum = cumsum(value) / sum(value)), with value replaced by your own column (depositos here).

0

Just to give feedback: based on Rafael’s answer, the code that calculated the relative frequency was:

dados <- dados %>%
  group_by(CENTRAL) %>%
  mutate(freq_relat = depositos / sum(depositos)) %>%
  mutate(freq_relat = round(freq_relat * 100, 2))
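
For the cumulative relative frequency also asked about in the question, here is a minimal sketch along the same lines. It rebuilds a few rows from the question as a data frame and assumes that depositos is already numeric (text values such as "4,61E+13" would need converting first) and that the grouping column is named CENTRAL. Sorting within each group before accumulating is my own choice; the thread does not specify an order.

```r
library(dplyr)

# Sample rows copied from the question. Assumption: depositos is numeric.
dados <- data.frame(
  CNPJ      = c(315406, 512839, 68987, 429890, 803287, 804046),
  CENTRAL   = c(rep("SICOOB CECRESP", 2), rep("SICOOB CREDIMINAS", 4)),
  depositos = c(4.61e13, 1.05e12, 5.22e13, 3.88e13, 3.82e13, 2.90e13)
)

resultado <- dados %>%
  group_by(CENTRAL) %>%
  # Order rows inside each group so the running total is well defined
  arrange(desc(depositos), .by_group = TRUE) %>%
  mutate(
    # Share of the group's total, as a percentage
    freq_relat = round(100 * depositos / sum(depositos), 2),
    # Running share of the group's total, as a percentage
    freq_cum   = round(100 * cumsum(depositos) / sum(depositos), 2)
  ) %>%
  ungroup()

print(resultado)
```

Within each CENTRAL, freq_cum should end at 100 on the last row. Because the sample values are rounded, freq_relat may differ in the last decimal from the percentages shown in the question.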
