Indicator in R with more than one condition and duplicate values

Viewed 133 times

1

Suppose there is a dataset with two variables, as follows:

Município   IF
RIOBOM  Cooperativa
RIOBOM  Cooperativa
ABADIA  Múltiplo
ABADIA  Múltiplo
ABADIA  Cooperativa
ABADIA  Banco
DOURADOS    Banco
DOURADOS    Múltiplo
DOURADOS    Banco
DOURADOS    Cooperativa
DOURADOS    Múltiplo
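
For anyone who wants to reproduce the example, here is a minimal sketch that builds the table above as a data frame. The object name `dados` is an assumption, chosen to match the name used in the first answer, and the columns are stored as plain character vectors rather than factors.

```r
# Assumed object name `dados`; values copied from the table above
dados <- data.frame(
  Município = c(rep("RIOBOM", 2), rep("ABADIA", 4), rep("DOURADOS", 5)),
  IF = c("Cooperativa", "Cooperativa",
         "Múltiplo", "Múltiplo", "Cooperativa", "Banco",
         "Banco", "Múltiplo", "Banco", "Cooperativa", "Múltiplo"),
  stringsAsFactors = FALSE,
  check.names = FALSE
)

# Cross-tabulate municipality against institution type to see
# which types occur in each municipality
table(dados[["Município"]], dados$IF)
```

The cross-tabulation is only a quick way to inspect which types each municipality has before building the indicator.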

How can I create an indicator that marks only those municipalities that have only "Cooperativa" and "Banco" and do not have "Múltiplo"? The result should be the following dataset:

Município   IF  Indicador
RIOBOM  Cooperativa 0
RIOBOM  Cooperativa 0    
ABADIA  Múltiplo    0
ABADIA  Múltiplo    0
ABADIA  Cooperativa 0
ABADIA  Banco   0
DOURADOS    Banco   0
DOURADOS    Cooperativa    1
DOURADOS    Banco   0

I asked a similar question with only one condition, and the solution found there used the grouped mean of the indicator:

Indicator in R conditioned on a variable with duplicate values

  • 1

    I was a little confused by your question. Shouldn't the municipality of DOURADOS have the marker 1 on all rows?

  • The same goes for the municipality of RIOBOM.

2 answers

3

If the problem description is correct and the expected result example is not, the following code solves the problem.

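# TRUE for rows whose IF is Cooperativa or Banco
i1 <- grepl("Cooperativa|Banco", dados$IF, ignore.case = TRUE)
# TRUE for rows whose IF is not Múltiplo
i2 <- !grepl("Múltiplo", dados$IF, ignore.case = TRUE)
# 1 when every row of the municipality passes both tests, 0 otherwise
dados$Indicador <- ave(i1 & i2, dados$Município, FUN = all) + 0L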

dados
#   Município          IF Indicador
#1     RIOBOM Cooperativa         1
#2     RIOBOM Cooperativa         1
#3     ABADIA    Múltiplo         0
#4     ABADIA    Múltiplo         0
#5     ABADIA Cooperativa         0
#6     ABADIA       Banco         0
#7   DOURADOS       Banco         0
#8   DOURADOS    Múltiplo         0
#9   DOURADOS       Banco         0
#10  DOURADOS Cooperativa         0
#11  DOURADOS    Múltiplo         0

  • From what I understand, the municipality of ABADIA would have the indicator 0 because it has Múltiplo. I think the asker wants the indicator to be 1 for municipalities that have only (and both) Cooperativa and Banco.

  • @Rafaelcunha Thank you, corrected.

2

As Rui said, your source data differs from the data in the expected result. I also understood the problem differently: I think the municipality of RIOBOM should have the indicator 0 because it has only Cooperativa. The following code addresses that interpretation:

df2 <- structure(list(Município = structure(c(3L, 3L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L), .Label = c("ABADIA", "DOURADOS", "RIOBOM"), class = "factor"), 
IF = structure(c(2L, 2L, 3L, 3L, 2L, 1L, 1L, 1L, 2L), .Label = c("Banco", 
"Cooperativa", "Múltiplo"), class = "factor")), .Names = c("Município", 
"IF"), class = "data.frame", row.names = c(NA, -9L))

library(dplyr)

df2 %>% 
  group_by(Município) %>% 
  # 1 only when the municipality has both Cooperativa and Banco and no Múltiplo
  mutate(Indicador = ifelse(any(IF == "Cooperativa") & any(IF == "Banco") & !any(IF == "Múltiplo"), 1, 0))

Answer based on this Stack Overflow question.
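
The same rule (both Cooperativa and Banco present, no Múltiplo) can also be sketched in base R with `ave()`, without dplyr. This is only an illustration under assumptions: the nine rows of `df2` are rebuilt here as plain character columns so the block runs on its own, and `has_value` is a hypothetical helper name that does not come from the answer.

```r
# Same nine rows as df2, rebuilt as character columns
df2 <- data.frame(
  Município = c("RIOBOM", "RIOBOM", "ABADIA", "ABADIA", "ABADIA", "ABADIA",
                "DOURADOS", "DOURADOS", "DOURADOS"),
  IF = c("Cooperativa", "Cooperativa", "Múltiplo", "Múltiplo", "Cooperativa",
         "Banco", "Banco", "Banco", "Cooperativa"),
  stringsAsFactors = FALSE,
  check.names = FALSE
)

# Hypothetical helper: TRUE on every row of a municipality
# that contains `value` at least once
has_value <- function(value) {
  ave(df2$IF == value, df2[["Município"]], FUN = any)
}

# 1 when the municipality has both Cooperativa and Banco and no Múltiplo
df2$Indicador <- as.integer(
  has_value("Cooperativa") & has_value("Banco") & !has_value("Múltiplo")
)

df2
```

With these rows, only the DOURADOS rows receive 1. RIOBOM gets 0 because it has no Banco, and ABADIA gets 0 because it has Múltiplo.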
