In the tidyverse, list-columns are manipulated the same way any other list is: with purrr. The difference is that the manipulation happens inside a data.frame, so it follows the tidyverse rules for working with data frames, for example by staying inside a mutate() call.
With that general point made, let's go to the questions.
Keep variables in the nest()
I see two possible interpretations of the question. In the first, the variable is expected to remain in the tibble returned by nest(); in that case the answer is already in the question itself. In the second, the variable is expected to be kept inside each nested tibble; this can be solved by adding the new variable to each nested tibble with a map function (map2() here, since two inputs are needed).
my %>%
mutate(data2 = map2(data, kmeans, ~mutate(.x, var = .y)))
# A tibble: 3 x 3
kmeans data data2
<fct> <list> <list>
1 1 <tibble [14 x 6]> <tibble [14 x 7]>
2 3 <tibble [10 x 6]> <tibble [10 x 7]>
3 2 <tibble [6 x 6]> <tibble [6 x 7]>
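For a version that can be run without the question's data, here is a minimal sketch that uses the built-in mtcars data as a stand-in for `my`. The names `nested` and `cyl_copy` are illustrative assumptions, not part of the original question.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Stand-in for `my`: nest mtcars by number of cylinders.
# The grouping variable (cyl) stays in the outer tibble.
nested <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()

# map2() walks over the list-column and the grouping variable in parallel
# and copies the group value into each nested tibble as a new column.
nested <- nested %>%
  mutate(data2 = map2(data, cyl, ~ mutate(.x, cyl_copy = .y)))

# Each tibble in data2 has exactly one more column than its source in data.
map_int(nested$data, ncol)
map_int(nested$data2, ncol)
```

The ungroup() call is there only so that the later mutate() operates on the whole tibble rather than group by group.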
Bring the list-column into the .GlobalEnv
First of all, if you really intend to stay within the tidyverse framework, you should not do this; the information should be kept in the tibble. With that caveat, the operation can be done the same way as in the question, remembering that the list must be named for it to work. So we would have:
ls()
[1] "cluster" "dataset" "my"
# Add names to the list elements
my$data <- set_names(my$data, paste0("tabela", seq_along(my$data)))
list2env(my$data, .GlobalEnv)
<environment: R_GlobalEnv>
ls()
[1] "cluster" "dataset" "my" "tabela1" "tabela2" "tabela3"
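If the goal is only to get each table as its own object, a slightly less invasive variant is to export into a dedicated environment instead of .GlobalEnv. The sketch below assumes a hypothetical list `tables` standing in for `my$data`, and the environment name `scratch` is likewise made up for illustration.

```r
library(purrr)

# Hypothetical stand-in for my$data: an unnamed list of data frames
tables <- list(head(mtcars, 3), head(iris, 3))

# list2env() needs a named list, so name the elements first
tables <- set_names(tables, paste0("table", seq_along(tables)))

# Export into a fresh environment rather than the global one
scratch <- new.env()
list2env(tables, envir = scratch)

# The objects now live in `scratch`, not in .GlobalEnv
ls(scratch)
```

Replacing `scratch` with `.GlobalEnv` gives the behaviour shown above.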
Mutate
Finally, applying mutate() to a list-column works as usual. However, to apply the operation to each element of the list-column (which is what is wanted in this case), a map() must be included inside the mutate().
my %>%
mutate(soma = map(data, ~mutate_if(.x, is.numeric, sum)),
final = map2(data, soma, bind_cols)) %>%
select(kmeans, final)
# A tibble: 3 x 2
kmeans final
<fct> <list>
1 1 <tibble [14 x 12]>
2 3 <tibble [10 x 12]>
3 2 <tibble [6 x 12]>
Note that it was enough to wrap your code in a formula inside map() for it to work. To produce the result expected in the question, I then joined the two data.frames into a single one with bind_cols().
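As a possible alternative, the summed columns can be appended in a single step with across(), which avoids the intermediate list-column and the bind_cols() call. This is only a sketch, assuming dplyr 1.0 or later and using mtcars as a stand-in for the question's data; the `_sum` suffix and the object names are illustrative choices.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Stand-in for `my`: nest mtcars by number of cylinders
nested <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()

# For every nested tibble, add one "<column>_sum" column per numeric column,
# keeping the original columns alongside the new ones.
result <- nested %>%
  mutate(final = map(
    data,
    ~ mutate(.x, across(where(is.numeric), sum, .names = "{.col}_sum"))
  ))

# Every tibble in `final` has twice as many columns as its source
map_int(result$final, ncol)
```

Because `.names` gives the new columns distinct names, the originals are not overwritten, which is the effect the two-step version achieves by binding the tables together.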
Not every list-column operation needs to result in another list-column. To return an atomic vector instead, use one of the map_*() variants.
my %>%
mutate(tamanho = map_dbl(data, nrow))
# A tibble: 3 x 3
kmeans data tamanho
<fct> <list> <dbl>
1 1 <tibble [14 x 6]> 14
2 3 <tibble [10 x 6]> 10
3 2 <tibble [6 x 6]> 6
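The choice of map_*() variant determines the type of the resulting column. A small sketch, again with mtcars as a stand-in and with illustrative column names (`n_rows`, `mean_mpg`):

```r
library(dplyr)
library(tidyr)
library(purrr)

sizes <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    # nrow() returns an integer, so map_int() keeps the column as <int>
    n_rows = map_int(data, nrow),
    # mean() returns a double, so map_dbl() is the matching variant
    mean_mpg = map_dbl(data, ~ mean(.x$mpg))
  )

sizes
```

map_dbl() also accepts integer results, as in the example above, and simply converts them to double.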
The first one has already been answered in the example itself, no? With the group_by() + nest() ... – Tomás Barcellos