How to filter the data frames in a list by a group variable common to all of them?

6

I have a list of n data frames. They all share a variable called group. I want to analyze only the rows of each data frame where group is a or c.

My goal: to get back the same list, with each data frame containing only the rows selected by group.

A reproducible example (dput output) follows:

mylist=list(structure(list(number = c(26.1218564352021, 40.3910428239033, 
29.2942556878552, 45.1165094505996, 26.7251219204627, 45.9554967121221, 
47.5653750263155, 43.1406186055392, 47.4260813184083, 23.5751851135865
), group = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L
), .Label = c("a", "b", "c"), class = "factor")), class = "data.frame", 
row.names = c(NA, 
-10L)), structure(list(number = c(47.6476960512809, 22.61412369553, 
48.3788266661577, 48.4475369821303, 41.6704738186672, 23.7482307478786, 
28.8278631540015, 30.1230939105153, 27.1230523264967, 49.5825876342133, 
40.2128369128332, 40.5727856047451, 48.3333457401022, 22.921603335999, 
25.0721591082402), group = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L), .Label = c("a", "b", "c", 
"d", "e"), class = "factor")), class = "data.frame", row.names = c(NA, 
-15L)))

2 answers

9


You can apply the filter function from the dplyr package inside lapply:

lapply(mylist, dplyr::filter, group %in% c("a", "c"))

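lapply calls filter on each data frame in mylist and passes the extra argument along as the filter condition (keep the rows whose group is a or c). The result is a list of the same length as mylist, with one filtered data frame per element.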
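
For readers without dplyr, the same idea can be sketched in base R. This is only an illustration: toy_list is hypothetical stand-in data (not the question's values) and keep_groups is a helper name made up for this example. It also calls droplevels, because filtering a factor column leaves the unused levels (b, d, e) attached unless they are dropped explicitly.

```r
# Toy stand-in for mylist (hypothetical data, not the question's values)
toy_list <- list(
  data.frame(number = c(10, 20, 30, 40),
             group  = factor(c("a", "b", "c", "c"))),
  data.frame(number = c(5, 15, 25),
             group  = factor(c("a", "d", "e")))
)

# Hypothetical helper: keep rows whose group is in `groups`,
# then drop factor levels that no longer occur in the result
keep_groups <- function(df, groups) {
  droplevels(df[df$group %in% groups, , drop = FALSE])
}

# lapply forwards the named argument `groups` to keep_groups
filtered <- lapply(toy_list, keep_groups, groups = c("a", "c"))

sapply(filtered, nrow)        # number of rows kept in each data frame
levels(filtered[[1]]$group)   # only the levels still present
```

Whether to drop unused levels depends on the later analysis; leaving out droplevels keeps the original level set.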

  • Thanks for the answer, @Rafael. But what if I needed to do this for two variables, group1 and group2? For example, what if I wanted to filter the data frames in the list by age (over 30 years) and sex (female)?

  • 2

    If you have another variable you want to filter on, you can combine the conditions with the & operator, as follows: lapply(mylist, dplyr::filter, group %in% c("a", "c") & sexo == "F") or separate them with a comma: lapply(mylist, dplyr::filter, group %in% c("a", "c"), sexo == "F")

  • What if I want to match on part of a value? For example, instead of masculino, search for masc to keep every row whose value contains that fragment.

  • 2

    lapply(mylist, dplyr::filter, group == "a", number > 40, stringr::str_detect(sexo, "Fem"))
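
The combined conditions and the partial match discussed in these comments can also be sketched in base R with grepl. The data below is a hypothetical toy list, since the question's data has no sexo column, and filter_partial is a made-up helper name. Note that both grepl and stringr::str_detect are case-sensitive by default, so the pattern masc would not match Masculino unless case is ignored, as done here with ignore.case = TRUE.

```r
# Toy stand-in for mylist (hypothetical data; sexo is the column name
# used in the comments, it is not part of the question's dput output)
toy_list <- list(
  data.frame(number = c(45, 30, 50, 42),
             group  = c("a", "a", "c", "b"),
             sexo   = c("Feminino", "Masculino", "Feminino", "Feminino")),
  data.frame(number = c(41, 48),
             group  = c("c", "a"),
             sexo   = c("Masculino", "Masculino"))
)

# Hypothetical helper: keep rows that satisfy every condition at once:
# group membership, a numeric threshold, and a case-insensitive
# substring match on sexo
filter_partial <- function(df, groups, min_number, pattern) {
  keep <- df$group %in% groups &
    df$number > min_number &
    grepl(pattern, df$sexo, ignore.case = TRUE)
  df[keep, , drop = FALSE]
}

females <- lapply(toy_list, filter_partial,
                  groups = c("a", "c"), min_number = 40, pattern = "fem")
males   <- lapply(toy_list, filter_partial,
                  groups = c("a", "c"), min_number = 40, pattern = "masc")
```

With dplyr and stringr, one way to get the same case-insensitive behavior is to wrap the pattern as stringr::regex("masc", ignore_case = TRUE) inside str_detect.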

5

Using base R's subset with multiple criteria:

lapply(mylist, subset, (group == 'a' | group == 'c') & number > 40)

[[1]]
    number group
2 40.39104     a
7 47.56538     c
8 43.14062     c
9 47.42608     c

[[2]]
    number group
1  47.64770     a
3  48.37883     a
10 49.58259     c
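
The row labels in this output (2, 7, 8, 9 and 1, 3, 10) are the original row positions, which subset preserves. The R documentation for subset also describes it as a convenience function intended for interactive use and recommends standard subsetting with [ for programming. A minimal sketch of that alternative, using hypothetical toy data and a made-up helper name filter_rows, which also renumbers the rows:

```r
# Toy stand-in for mylist (hypothetical data, not the question's values)
toy_list <- list(
  data.frame(number = c(26, 41, 29, 45, 47),
             group  = factor(c("a", "a", "b", "b", "c")))
)

# Hypothetical helper: same criteria as the subset call,
# written with standard `[` indexing
filter_rows <- function(df) {
  keep <- df$group %in% c("a", "c") & df$number > 40
  out <- df[keep, , drop = FALSE]
  rownames(out) <- NULL  # renumber rows 1..n instead of keeping original positions
  out
}

result <- lapply(toy_list, filter_rows)
```

Skip the rownames line if the original row positions are useful for tracing rows back to the source data.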
