Suppose I have the following data set:
set.seed(12)
dados <- data.frame(grupos=rep(letters[1:5], 5), valores=rnorm(25))
head(dados)
  grupos    valores
1      a -1.8323176
2      b -0.0560389
3      c  0.6692396
4      d  0.8067977
5      e  0.2374370
6      a  0.7894452
How can I filter only the rows whose group is equal to a or b? I know how to filter rows that match a single level:
library(dplyr)
dados %>%
  filter(grupos == "a")
  grupos    valores
1      a -1.8323176
2      a  0.7894452
3      a -0.9993864
4      a  0.3844801
5      a -1.3305330
dados %>%
  filter(grupos == "b")
  grupos     valores
1      b -0.05603890
2      b  0.37511302
3      b -0.03578014
4      b  0.65215941
5      b  1.64394981
I could run each filter individually and combine the results afterwards. However, my real problem is larger: it is a data frame with 26,691 rows, from which I must keep 1,116 different values. Filtering each of these values individually and then combining them at the end is impractical.
Have you tried dados %>% filter(grupos=="a"|grupos=="b")?
– José
Yes, I tried that. The problem is that there are 1,116 different levels of interest in my original data set. To use this solution, I would have to write code like
dados %>% filter(grupos=="a1"|grupos=="a2"|...|grupos=="a1116")
, which I find impractical. – Marcus Nunes
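A sketch of one way to avoid the long chain of | comparisons: store the wanted levels in a vector and test membership with %in%. The vector name niveis is an assumption made for illustration; in the real data set it would hold the 1,116 values of interest.

```r
library(dplyr)

set.seed(12)
dados <- data.frame(grupos = rep(letters[1:5], 5), valores = rnorm(25))

# Hypothetical vector of wanted levels; it could just as well have 1,116 entries
niveis <- c("a", "b")

# %in% is TRUE for every row whose grupos value appears in niveis
resultado <- dados %>%
  filter(grupos %in% niveis)

# Base R equivalent, without dplyr
resultado_base <- dados[dados$grupos %in% niveis, ]
```

Because the levels live in an ordinary vector, they can be read from a file or taken from another data frame instead of being typed by hand.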
An alternative is to use regex, as long as you can identify a common pattern in the groups you want to filter. Example: dados <- dados[stringr::str_which(dados$grupos, "(a|b)"), ]. Or: padrao <- stringr::str_replace_all(toString(letters[c(1,2)]), ",\\s", "|"); dados <- dados[stringr::str_which(dados$grupos, padrao), ]
– José
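If the regex route is taken, note that an unanchored pattern such as a|b also matches longer names that merely contain those letters. A minimal sketch, assuming the levels contain no regex metacharacters, builds an anchored pattern from a vector of levels. The names grupos, niveis and padrao, and the toy values, are illustrative only.

```r
library(stringr)

# Toy vector with names that contain, but are not equal to, the wanted levels
grupos <- c("a", "b", "ab", "c", "ba", "b")
niveis <- c("a", "b")

# Build "^(a|b)$" so that only exact matches are kept
padrao <- paste0("^(", paste(niveis, collapse = "|"), ")$")

# str_which() returns the positions of the matching elements
idx <- str_which(grupos, padrao)
grupos[idx]
```

With the unanchored pattern "(a|b)", the elements "ab" and "ba" would be selected as well.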