Consult Multiple Columns in R

Asked

Viewed 53 times

1

I have a DF and would like to consult multiple columns at the same time in an Auxiliary Table. Example of DF:

MATRICULA <- c(123,456,789,111,222,333)
DIA_1 <- c('ARROZ','','ARROZ','','ARROZ','')
DIA_2 <- c('FEIJAO','','FEIJAO','','','FEIJAO')
DIA_3 <- c('BATATA','','BATATA','','BATATA','BATATA')
DIA_4 <- c('COUVE','COUVE','','COUVE','','COUVE')
DIA_5 <- c('BIFE','','BIFE','','BIFE','BIFE')
DADOS <- data.frame(MATRICULA,DIA_1,DIA_2,DIA_3,DIA_4,DIA_5)

My Auxiliary Table

AUXILIAR <- c('ARRROZ','FEIJAO')
TABELA_AUX <- data.frame(AUXILIAR)

Therefore, I would like to consult the 5 columns at the same time (DIA_1 to DIA_5) to check if the plate consumed any food from the Auxiliary Table (rice OR ugliness).

As a result, I would like to see the following output:

MATRICULA <- c(123,456,789,111,222,333)
DIA_1 <- c('ARROZ','','ARROZ','','ARROZ','')
DIA_2 <- c('FEIJAO','','FEIJAO','','','FEIJAO')
DIA_3 <- c('BATATA','','BATATA','','BATATA','BATATA')
DIA_4 <- c('COUVE','COUVE','','COUVE','','COUVE')
DIA_5 <- c('BIFE','','BIFE','','BIFE','BIFE')
VERIFICACAO <- c('SIM','NAO','SIM','NAO','SIM','SIM') 
RESULTADO <- data.frame(MATRICULA,DIA_1,DIA_2,DIA_3,DIA_4,DIA_5,VERIFICACAO)

Thank you!

1 answer

2


There are several ways. Here one using the function apply, with a for and with the package dplyr.

I only posted the first one that I could get with the same data you sent. The others need the entries to be characters and not factors, which is easy to fix in the input or code and even makes more sense by the way the data is organized.

MATRICULA <- c(123,456,789,111,222,333)
DIA_1 <- c('ARROZ','','ARROZ','','ARROZ','')
DIA_2 <- c('FEIJAO','','FEIJAO','','','FEIJAO')
DIA_3 <- c('BATATA','','BATATA','','BATATA','BATATA')
DIA_4 <- c('COUVE','COUVE','','COUVE','','COUVE')
DIA_5 <- c('BIFE','','BIFE','','BIFE','BIFE')
DADOS <- data.frame(MATRICULA,DIA_1,DIA_2,DIA_3,DIA_4,DIA_5)


AUXILIAR <- c('ARROZ','FEIJAO')
TABELA_AUX <- data.frame(AUXILIAR)

#### USANDO A FUNÇÃO APPLY

DIAS <- c("DIA_1","DIA_2","DIA_3","DIA_4","DIA_5")

VERIFICACAO_1 <- apply(DADOS[,DIAS],1 , function(x){ #aplica a função nas linhas
  if (any(x %in% AUXILIAR)) { #testa se os valores estão na auxiliar
    "SIM"} else "NAO" #cria o vetor de verificação
})

RESULTADO_1 <- data.frame(DADOS,
                          "VERIFICACAO" = VERIFICACAO_1)

### usando um for
#Pra esse método e pro próximo os dados tem que tá em character e não em factor
#use stringASFactors = F na importação ou as.character() nas colunas

DADOS <- data.frame(MATRICULA,DIA_1,DIA_2,DIA_3,DIA_4,DIA_5, stringsAsFactors = F)

Numero_linhas <- dim(DADOS)[1]
VERIFICACAO_2 <- rep(0, Numero_linhas)

for(i in 1:Numero_linhas){
  VERIFICACAO_2[i] <-  ifelse(any(DADOS[i,DIAS] %in% AUXILIAR),
         "SIM", "NAO")
}


RESULTADO_2 <- data.frame(DADOS,
                          "VERIFICACAO" = VERIFICACAO_2)


### usando dplyr
#Se a entrada for character que nem a de cima
#Cria direto a tabela resultado

library(dplyr)

DIAS <- quo(c(DIA_1,DIA_2,DIA_3,DIA_4,DIA_5)) #permite que o dplyr entenda os nomes das colunas

RESULTADO_3 <- DADOS %>% 
  rowwise() %>% #faz a operação nas linhas
  mutate(VERIFICACAO = ifelse(any(!!(DIAS) %in% AUXILIAR),
                              "SIM", "NAO") #cria a coluna de verificação
  ) %>% 
  ungroup() #separa o grupos de linhas feitos por rowWise()


#testando se são iguais a referencia
VERIFICACAO <- c('SIM','NAO','SIM','NAO','SIM','SIM') 
VERIFICACAO == VERIFICACAO_1
VERIFICACAO == VERIFICACAO_2
VERIFICACAO == RESULTADO_3$VERIFICACAO
  • Dear Jorge, thank you! However, the way you put the "CHECK" column indicates that it was not built automatically (you already report the results). My problem would be how to automatically build the "CHECK" column. Also, I realized that your query sweeps all rows of the file, however I can’t do that, I have to limit only to these columns (DIA_1 until DIA_5). Could you help me again after these remarks? Thank you.

  • I edit with some modifications. Only I created the column automatically, using DATA and HELPER. At the end I only created a column with the name "CHECK" because it is a better name than "VERIFICACAO_1", but I did not use the example variable. I left it there just for testing.

  • I made the edits. I put a filter vector for the columns and left the test pro final, to make it clearer than the vector VERIFICACAO did not enter the table.

  • Thanks again. You said you have other ways besides this with apply. Could you show me another suggestion if possible? Thanks.

  • Okay, I did one more issue

Browser other questions tagged

You are not signed in. Login or sign up in order to post.