Select ID vectors with certain characteristics in R

I have a data frame with four columns of values for each ID, and I need to create a new data frame that excludes the IDs whose rows contain more than one zero or more than one NA.

I created the data frame like this:

library(dplyr)
library(tidyr)  # needed for gather(), used further down
co_entidade <- c(23, 40, 58, 82, 104, 171, 198, 201, 202, 244)
depend <- c(2, 3, 4, 4, 4, 4, 4, 2, 3, 4)
mat13 <- c(42, 218, 1397, 0, 393, 283, 1053, 529, NA, 664)
mat14 <- c(44, 222, 1300, 0, 428, 246, 994, 521, NA, 678)
mat15 <- c(40, 215, 1345, 199, 0, 226, 1069, 566, NA, 598)
mat16 <- c(10, 208, 1442, 154, 0, 229, 1033, NA, 521, 552)

df <- data.frame(co_entidade, depend, mat13, mat14, mat15, mat16)
df

[Image: Enrollments 2013 to 2016 by entity]

I tried a dplyr filter that does remove the zeros and NAs, but the result comes back in long format, with the IDs split into one row per year, as shown below:

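desc_0_NA <- df %>%
  gather(mat_tipo, mat_valor, mat13:mat16) %>%
  # NA must be tested with is.na(); comparing against the string "NA" does not test for missing values
  filter(mat_valor > 0, !is.na(mat_valor))
desc_0_NA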

[Image: Result of the command]

What I actually need is to remove every co_entidade that has more than one 0 or more than one NA. In this example I should end up with a df without the codes 82, 104 and 202, underlined in red in the image below: 82 and 104 each have more than one zero, and 202 has more than one NA.

[Image: IDs to be filtered]

Does anyone know how to do this in R, regardless of which years the zeros or NAs fall in?

Thanks in advance

1 answer

First I created a function is.0(), along the lines of is.na(), to test whether the value of a cell is 0:

is.0 <- function(x){x == 0}
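A quick sketch of how this helper behaves with missing values: comparing NA with 0 gives NA rather than FALSE, which is why the row sums further down need na.rm = TRUE. The vector x is only an illustrative assumption, not data from the question.

```r
# Same helper as above, repeated so this snippet runs on its own
is.0 <- function(x) { x == 0 }

x <- c(0, NA, 5)                   # illustrative values (assumption)
res <- is.0(x)                     # NA stays NA, it does not become FALSE
n_zero <- sum(res, na.rm = TRUE)   # count zeros, ignoring the NA
print(res)
print(n_zero)
```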

Then I used functions from the dplyr package:

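df <- df %>%
  # count zeros and NAs in each row ('.' is the data frame coming through the pipe)
  mutate(S.0 = rowSums(is.0(.), na.rm = TRUE),
         S.NA = rowSums(is.na(.))) %>%
  # keep only rows with at most one zero and at most one NA
  filter(S.0 <= 1, S.NA <= 1) %>%
  # drop the helper columns
  select(-S.0, -S.NA)
df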
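One caveat with counting over the whole data frame is that zeros or NAs in co_entidade or depend would be counted as well. A possible variant in base R, as a minimal sketch, counts only over the enrollment columns. It assumes every value column name starts with "mat"; the names dados, mat_cols, n_zero, n_na and filtrado are hypothetical and chosen just for this example.

```r
# Rebuild the example data from the question so the snippet runs on its own
dados <- data.frame(
  co_entidade = c(23, 40, 58, 82, 104, 171, 198, 201, 202, 244),
  depend      = c(2, 3, 4, 4, 4, 4, 4, 2, 3, 4),
  mat13 = c(42, 218, 1397, 0, 393, 283, 1053, 529, NA, 664),
  mat14 = c(44, 222, 1300, 0, 428, 246, 994, 521, NA, 678),
  mat15 = c(40, 215, 1345, 199, 0, 226, 1069, 566, NA, 598),
  mat16 = c(10, 208, 1442, 154, 0, 229, 1033, NA, 521, 552)
)

# Assumption: the value columns are exactly those whose names start with "mat"
mat_cols <- grep("^mat", names(dados), value = TRUE)

# Per-row counts of zeros and of NAs, restricted to the value columns
n_zero <- rowSums(dados[mat_cols] == 0, na.rm = TRUE)
n_na   <- rowSums(is.na(dados[mat_cols]))

# Keep rows with at most one zero and at most one NA
filtrado <- dados[n_zero <= 1 & n_na <= 1, ]
filtrado$co_entidade
```

On the example data this drops 82, 104 and 202 and keeps 201, which has only a single NA.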
