Add multiple selects in a single dataset

Asked

Viewed 60 times

1

I have the dataset below and I do a consolidation of the categories Mk_cap, Exports and Money_supply, but each of these grids has a different Unit.

df <- data.frame(Mes=c("Jan","Fev","Mar","Abr","Mai",
                   "Jan","Fev","Mar","Abr","Mai",
                   "Jan","Fev","Mar","Abr","Mai"),
             Ano=c(2005,2006,2007,2008,2009,
                   2005,2006,2007,2008,2009,
                   2005,2006,2007,2008,2009),
             Mk_Cap=c(11:15,116:120,1111:1115), 
             Exports=c(21:25,146:150,1351:1355),
             Money_Supply=c(31:35,546:550,2111:2115),
             Unit=c("USD","USD","USD","USD","USD","200=10",
                    "200=10","200=10","200=10","200=10",
                    "CNY","CNY","CNY","CNY","CNY"))

inserir a descrição da imagem aqui

Today I’m consolidating in the way below:

library(dplyr)
Money_Supply <- df %>% dplyr::select(Ano, Mes,Money_Supply) %>% dplyr::filter(df$Unit == "USD")
Mk_Cap <- df %>% dplyr::select(Mk_Cap) %>% dplyr::filter(df$Unit == "200=10")
Exports <- df %>% dplyr::select(Exports) %>% dplyr::filter(df$Unit == "CNY")

Consolidado <- base::cbind(Money_Supply,Mk_Cap,Exports)

inserir a descrição da imagem aqui

I believe that is not the most correct way to do this, but today is the way I found, in this example I passed has few occurrences, but in the practical case I do it in more than 30 variables what is extremely expensive, if you have some easier way would be the ideal.

2 answers

4


The problem is not the unit of each column, but rather that each unit requires a different column. The way is to map each unit to the required column and then join it all together.

library(dplyr)
library(purrr)

df <- data.frame(Mes=c("Jan","Fev","Mar","Abr","Mai",
                       "Jan","Fev","Mar","Abr","Mai",
                       "Jan","Fev","Mar","Abr","Mai"),
                 Ano=c(2005,2006,2007,2008,2009,
                       2005,2006,2007,2008,2009,
                       2005,2006,2007,2008,2009),
                 Mk_Cap=c(11:15,116:120,1111:1115), 
                 Exports=c(21:25,146:150,1351:1355),
                 Money_Supply=c(31:35,546:550,2111:2115),
                 Unit=c("USD","USD","USD","USD","USD","200=10",
                        "200=10","200=10","200=10","200=10",
                        "CNY","CNY","CNY","CNY","CNY"))


mapa <- list(
  "USD" = "Money_Supply",
  "200=10" = "Mk_Cap",
  "CNY" = "Exports"
)

sub_list <- mapa %>% 
  map2(
    .,
    names(.),
    ~df %>%
      filter(Unit == bquote(.(.y))) %>%
      select(Ano, Mes, bquote(.(.x)))
  )

purrr::reduce(sub_list, dplyr::left_join)
  • Gee, cool use of reduce. I always find it hard to imagine how to use it kkk

3

You can also group the data by month and year, take the values you want and then take the repeated columns. The code is less general but is closer to its initial attempt.

df <- data.frame(Mes=c("Jan","Fev","Mar","Abr","Mai",
                       "Jan","Fev","Mar","Abr","Mai",
                       "Jan","Fev","Mar","Abr","Mai"),
                 Ano=c(2005,2006,2007,2008,2009,
                       2005,2006,2007,2008,2009,
                       2005,2006,2007,2008,2009),
                 Mk_Cap=c(11:15,116:120,1111:1115), 
                 Exports=c(21:25,146:150,1351:1355),
                 Money_Supply=c(31:35,546:550,2111:2115),
                 Unit=c("USD","USD","USD","USD","USD","200=10",
                        "200=10","200=10","200=10","200=10",
                        "CNY","CNY","CNY","CNY","CNY"))



library(dplyr)

df %>% 
  group_by(Mes, Ano) %>% 
  mutate(
    Money_Supply = Money_Supply[Unit == "USD"],
    Mk_Cap = Mk_Cap[Unit  == "200=10"],
    Exports = Exports[Unit == "CNY"]
  ) %>% 
  select(-Unit) %>% 
  distinct()
#> # A tibble: 5 x 5
#> # Groups:   Mes, Ano [5]
#>   Mes     Ano Mk_Cap Exports Money_Supply
#>   <chr> <dbl>  <int>   <int>        <int>
#> 1 Jan    2005    116    1351           31
#> 2 Fev    2006    117    1352           32
#> 3 Mar    2007    118    1353           33
#> 4 Abr    2008    119    1354           34
#> 5 Mai    2009    120    1355           35

Created on 2021-03-08 by the reprex package (v1.0.0)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.