R - How to convert values into percentages using data.table?

I have a data.table with the columns Região, ano and Quantidade.

Quantidade holds absolute counts; I would like to convert them to percentages of each year's total.

What I have:

      Região   ano  Quantidade
       Norte  2017         252
    Nordeste  2017         281
     Sudeste  2017        2038
         Sul  2017        2246
Centro-Oeste  2017         525
   ...
   ...

What I want:

      Região   ano  Quantidade  % por ano
       Norte  2017         252        4.5
    Nordeste  2017         281          5
     Sudeste  2017        2038         36
         Sul  2017        2246         40
Centro-Oeste  2017         525        4.1
   ...
   ...

I’m trying to do this with tapply():

Estab_regiao <- Estab_regiao[,Freq:=Quantidade/tapply(Quantidade, ano, FUN=sum)*100,]

But I get the following error:

Error in eval(jsub, SDenv, parent.frame()) : 
  dims [product 8] do not match the length of object [40]

Data:

 dput(Estab_regiao)
structure(list(Região = c("Norte", "Nordeste", "Sudeste", "Sul", 
"Centro-Oeste", "Norte", "Nordeste", "Sudeste", "Sul", "Centro-Oeste", 
"Norte", "Nordeste", "Sudeste", "Sul", "Centro-Oeste", "Norte", 
"Nordeste", "Sudeste", "Sul", "Centro-Oeste", "Norte", "Nordeste", 
"Sudeste", "Sul", "Centro-Oeste", "Norte", "Nordeste", "Sudeste", 
"Sul", "Centro-Oeste", "Centro-Oeste", "Norte", "Nordeste", "Sudeste", 
"Sul", "Centro-Oeste", "Norte", "Nordeste", "Sudeste", "Sul"), 
    ano = c(2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 
    2016, 2015, 2015, 2015, 2015, 2015, 2014, 2014, 2014, 2014, 
    2014, 2013, 2013, 2013, 2013, 2013, 2012, 2012, 2012, 2012, 
    2012, 2011, 2011, 2011, 2011, 2011, 2010, 2010, 2010, 2010, 
    2010), Quantidade = c(252L, 281L, 2038L, 2246L, 525L, 233L, 
    265L, 1952L, 2193L, 502L, 187L, 247L, 1881L, 2059L, 469L, 
    155L, 195L, 1808L, 1975L, 449L, 113L, 182L, 1758L, 1830L, 
    441L, 108L, 186L, 1746L, 1835L, 423L, 397L, 106L, 206L, 1697L, 
    1749L, 345L, 98L, 185L, 1653L, 1584L)), row.names = c(NA, 
-40L), class = c("data.table", "data.frame"))

1 answer

If I understand correctly, you can do it in two steps and then drop the helper column:

# total quantity per region
d[, total := sum(Quantidade), by = Região]
# frequency as a percentage of the total
d[, Freq := Quantidade/total*100]
# remove the helper column
d[, total := NULL]
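
Since the question asks for percentages per year, the same steps can be grouped by `ano` instead of `Região`. Below is a minimal sketch of that variant. It assumes the data.table package is installed and, to stay short, uses only the 2017 and 2016 rows from the question's `dput()` output. The name `d` follows the code above.

```r
library(data.table)

# Subset of the question's data: the 2017 and 2016 rows only
d <- data.table(
  Região = rep(c("Norte", "Nordeste", "Sudeste", "Sul", "Centro-Oeste"), 2),
  ano = rep(c(2017, 2016), each = 5),
  Quantidade = c(252L, 281L, 2038L, 2246L, 525L,
                 233L, 265L, 1952L, 2193L, 502L)
)

# total quantity per year, repeated on every row of that year
d[, total := sum(Quantidade), by = ano]
# share of each row in its year's total, as a percentage
d[, Freq := Quantidade / total * 100]
# remove the helper column
d[, total := NULL]

# sanity check: the percentages of each year should add up to 100
check <- d[, .(soma = sum(Freq)), by = ano]
```

Within each year the `Freq` values should sum to 100, which `check` makes easy to confirm.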
  • Swapping Região for ano, that's exactly it! Thank you very much! But is there no way to do it in one line?

  • Not that I know of.

  • It can be done in one line with d[, Freq := Quantidade/sum(Quantidade)*100, by = Região]
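
The one-line form works because, with `by`, `sum(Quantidade)` is evaluated inside each group and the result is assigned to that group's rows. The `tapply()` call in the question instead returns one total per year (8 values for the full data), and R cannot divide the 40-row column by it, hence the dimension error. Here is a small sketch of both points, grouped by `ano` as the question asks. It assumes data.table is installed and again uses only the 2017 and 2016 rows of the question's data; `totals` is a name introduced here for illustration.

```r
library(data.table)

# Subset of the question's data: the 2017 and 2016 rows only
d <- data.table(
  Região = rep(c("Norte", "Nordeste", "Sudeste", "Sul", "Centro-Oeste"), 2),
  ano = rep(c(2017, 2016), each = 5),
  Quantidade = c(252L, 281L, 2038L, 2246L, 525L,
                 233L, 265L, 1952L, 2193L, 502L)
)

# tapply() yields one total per year, not one value per row
totals <- tapply(d$Quantidade, d$ano, FUN = sum)
length(totals)  # 2 for this subset; 8 for the full data

# one line: sum() is computed within each year and matched to its rows
d[, Freq := Quantidade / sum(Quantidade) * 100, by = ano]
```

If rounded values are preferred for display, wrapping the expression in `round(..., 1)` is one option.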
