How to use filter_functions?




I have been trying to use the functions filter_all, filter_at and filter_if, without success, mainly on strings. Consider the data set below:


data_1 <- data.frame(
  a = c(paste('group', 1:6, sep = '_')), 
  b = c(paste('new', 1:6, sep = '_')), 
  d = c(rnorm(6, 10, 1))
)


  • How to filter, in one step, every row that contains the character 1? (filter_all)

  • How to filter, in one step, every row that contains 1 or 3 in the variables a and b? (filter_at)

  • How to filter, in one step, every row that contains 1 or 3 in the variables a and b and has a value greater than (>) 10 in the variable d? (filter_at)

  • How to filter on every character column, keeping the rows that contain 1 or 3? (filter_if)

A small sketch of what I tried:


filter_at(data_1, c('a', 'b'), any_vars('1'))

Error: No tidyselect variables were registered

I tried to filter the variables a and b, but it didn’t work out.

I have never used filter with these suffixes, hence the question.

1 answer


First of all I will redo the data with set.seed to make results reproducible and with the argument stringsAsFactors = FALSE, to answer the last question.


library(dplyr)

# Call set.seed() here; the seed used for the output below is not shown.
data_1 <- data.frame(
  a = c(paste('group', 1:6, sep = '_')), 
  b = c(paste('new', 1:6, sep = '_')), 
  d = c(rnorm(6, 10, 1)),
  stringsAsFactors = FALSE
)

As for the questions, I will make one small change to the way you tried to solve them: I will use the pipe operator %>%.

Common to all the solutions is the use of grepl, since columns a and b are of class "character".

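The first question looks like the easiest one, but it is not completely clear whether you want only the rows where '1' occurs in all columns or in at least one of them. Note that filter_all also tests the numeric column d, converted to character, which is why the any_vars version keeps rows whose value of d contains the digit 1.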


data_1 %>%
   filter_all(all_vars(grepl('1', .)))
#        a     b        d
#1 group_1 new_1 8.792934

data_1 %>%
   filter_all(any_vars(grepl('1', .)))
#        a     b         d
#1 group_1 new_1  8.792934
#2 group_2 new_2 10.277429
#3 group_3 new_3 11.084441
#4 group_5 new_5 10.429125
#5 group_6 new_6 10.506056
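
Rows 2, 3, 5 and 6 are kept above even though a and b do not contain '1' there. A minimal sketch of the reason, using made-up numbers rather than the values of data_1: grepl coerces a numeric vector to character before matching, so any number with a 1 among its digits matches.

```r
# Hypothetical values, chosen only to illustrate the coercion
d <- c(8.5, 10.2, 9.7)

# grepl() matches against the character representation of the numbers
d_chr <- as.character(d)    # "8.5" "10.2" "9.7"
hits  <- grepl('1', d)      # FALSE TRUE FALSE

print(d_chr)
print(hits)
```

If only the text columns should be tested, restricting the filter to a and b (filter_at) or to character columns (filter_if) avoids this.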

The second question is simpler. It is solved with grepl applied to the pronoun '.', with the pattern '1|3' matching either 1 or 3.

data_1 %>%
  filter_at(vars(a, b), any_vars(grepl('1|3', .)))
#        a     b         d
#1 group_1 new_1  8.792934
#2 group_3 new_3 11.084441
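
One caveat about the pattern '1|3': it matches a 1 or a 3 anywhere in the string. That is enough for the six groups here, but with more groups it would also match values such as group_10 or group_13. A small sketch, assuming the number always comes last after an underscore (the vector x is hypothetical):

```r
# Hypothetical labels, including two-digit suffixes
x <- c('group_1', 'group_3', 'group_10', 'group_13')

loose  <- grepl('1|3', x)       # matches a 1 or a 3 anywhere
strict <- grepl('_(1|3)$', x)   # matches only a suffix that is exactly 1 or 3

print(loose)    # TRUE TRUE TRUE TRUE
print(strict)   # TRUE TRUE FALSE FALSE
```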

The third question needs a composite logical condition.

data_1 %>%
  filter_at(vars(a, b, d), 
            all_vars(grepl('1|3', a) & grepl('1|3', b) & d > 10))
#        a     b        d
#1 group_3 new_3 11.08444
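
Since the condition above names a, b and d explicitly and never uses the pronoun '.', the same rows can probably be obtained with plain filter(). A sketch with illustrative data: data_ex and its d values are made up so that the result does not depend on a random seed.

```r
library(dplyr)

# Illustrative data; the values of d are hypothetical
data_ex <- data.frame(
  a = paste('group', 1:6, sep = '_'),
  b = paste('new', 1:6, sep = '_'),
  d = c(8.5, 10.2, 11.1, 9.7, 10.4, 10.6),
  stringsAsFactors = FALSE
)

# Conditions separated by commas are combined with AND
res <- data_ex %>%
  filter(grepl('1|3', a), grepl('1|3', b), d > 10)

print(res)
```

With these values only the group_3 row is kept: group_1 matches the pattern but its d is below 10.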

Finally, filter_if. Here too you have to choose between all_vars and any_vars. The results happen to be the same, because a and b carry the same numeric suffix in every row.

data_1 %>%
   filter_if(~ is.character(.), all_vars(grepl('1', .)))
#        a     b        d
#1 group_1 new_1 8.792934

data_1 %>%
   filter_if(~ is.character(.), any_vars(grepl('1', .)))
#        a     b        d
#1 group_1 new_1 8.792934
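
For readers on a recent dplyr: the scoped verbs filter_all, filter_at and filter_if are superseded, and the same filters can be written with if_any() and if_all() inside filter(). A sketch, assuming dplyr 1.0.4 or later and using made-up data (data_ex and its d values are hypothetical):

```r
library(dplyr)

# Illustrative data; the values of d are hypothetical
data_ex <- data.frame(
  a = paste('group', 1:6, sep = '_'),
  b = paste('new', 1:6, sep = '_'),
  d = c(8.5, 10.2, 11.1, 9.7, 10.4, 10.6),
  stringsAsFactors = FALSE
)

# Counterpart of filter_if(..., any_vars(...)): at least one character column matches
any_rows <- data_ex %>%
  filter(if_any(where(is.character), ~ grepl('1|3', .x)))

# Counterpart of filter_at(vars(a, b), all_vars(...)): both a and b match
all_rows <- data_ex %>%
  filter(if_all(c(a, b), ~ grepl('1|3', .x)))

print(any_rows)
print(all_rows)
```

Both calls keep the group_1 and group_3 rows here, for the same reason as above: a and b share the same suffix in every row.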
  • Great. Thank you, Rui.
