Converting a hierarchical data.frame to a nested list in R

6

I have the following data.frame d:

x <- data.frame(a=letters[1:3], b=letters[4:6], c=letters[7:9], stringsAsFactors=FALSE)
d <- tidyr::expand(x, a, b, c)
d

Source: local data frame [27 x 3]

   a b c
1  a d g
2  a d h
3  a d i
4  a e g
5  a e h
6  a e i
7  a f g
8  a f h
9  a f i
10 b d g
11 b d h
12 b d i
13 b e g
14 b e h
15 b e i
16 b f g
17 b f h
18 b f i
19 c d g
20 c d h
21 c d i
22 c e g
23 c e h
24 c e i
25 c f g
26 c f h
27 c f i

I would like to get a nested list like this (to work with JSON and things like that):

$a
$a$d
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"
$a$e
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"
$a$f
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"

$b
$b$d
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"
$b$e
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"
$b$f
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"

$c
$c$d
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"
$c$e
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"    
$c$f
[1] "g" "h" "i" "g" "h" "i" "g" "h" "i"

To do this, I used the function tree below:

tree <- function(d) {
  aux <- with(d, split(b, list(a)))
  res <- lapply(aux, function(x) with(d, split(c, list(x))))
  res
}

Now, suppose I have a hierarchical data.frame with n columns. How do I create a nested list from it?

Thank you!

Related questions

https://stackoverflow.com/questions/7247108/problems-splitting-data-frame-into-a-nested-list
https://stackoverflow.com/questions/17951334/hierarchical-data-frame-to-json-with-irregular-nodes

2 answers

5


A solution using recursion would be:

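rec_split <- function(df) {
  if (ncol(df) == 2) {
    # Base case: split the last column by the second-to-last one
    split(df[[2]], df[[1]])
  } else {
    # Recursive case: split the remaining columns by the first, then recurse
    lapply(split(df[-1], df[[1]]), rec_split)
  }
}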

Example:

rec_split(d)
$a
$a$d
[1] "g" "h" "i"
$a$e
[1] "g" "h" "i"
$a$f
[1] "g" "h" "i"

$b
$b$d
[1] "g" "h" "i"
$b$e
[1] "g" "h" "i"
$b$f
[1] "g" "h" "i"

$c
$c$d
[1] "g" "h" "i"
$c$e
[1] "g" "h" "i"
$c$f
[1] "g" "h" "i"
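
The same recursion should carry over to deeper hierarchies. As a minimal sketch, assuming a made-up four-column data frame (d4 and its values are hypothetical, and it is built with base R's expand.grid rather than tidyr), the function can be tried like this:

```r
# Recursive split, as in the answer above: the last column ends up in the leaves.
rec_split <- function(df) {
  if (ncol(df) == 2) {
    split(df[[2]], df[[1]])
  } else {
    lapply(split(df[-1], df[[1]]), rec_split)
  }
}

# Hypothetical four-column hierarchy (illustration only)
d4 <- expand.grid(
  a = c("x", "y"),
  b = c("p", "q"),
  c = c("m", "n"),
  d = c("u", "v"),
  stringsAsFactors = FALSE
)

nested <- rec_split(d4)

# Inspect the first two levels, then one leaf
str(nested, max.level = 2)
nested$x$p$m
```

Each extra column adds one level of nesting, so a leaf is reached with one name per grouping column (here nested$x$p$m).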
  • By the Blade of Oompa, this solution is equivalent and uses fewer resources, so I'll mark it as accepted. That said, I think Evandro Dalbem's solution is great; it could even be used for other, more complex problems.

4

You've already done most of the work. All that's missing is a structure that applies the split recursively up to the penultimate column (with the last column becoming the result vector, as I understand it). This looked like an ingenious solution to your problem: https://stackoverflow.com/questions/11539026/split-data-frame-apply-function-and-return-results-in-a-nested-list

library(plyr)

nested.dlply <- function(df, by, fun, ...) {
   if (length(by) == 1) {
      dlply(df, by, fun, ...)
   } else {
      dlply(df, by[1], nested.dlply, by[-1], fun, ...)
   }
}

x <- data.frame(a=letters[1:3], b=letters[4:6], c=letters[7:9], stringsAsFactors=FALSE)
d <- tidyr::expand(x, a, b, c)

var.names <- names(d)
n <- length(var.names)
# Nest by every column except the last; [[ ]] returns the last column as a plain vector
d.list <- nested.dlply(d, var.names[-n], function(x) x[[var.names[n]]])

jsonlite::toJSON(d.list, pretty=TRUE) # If you want to convert it to JSON

And the result is

> d.list$a
$d
[1] "g" "h" "i"

$e
[1] "g" "h" "i"

$f
[1] "g" "h" "i"

There is room for improvement here. For example, in the code above the anonymous function that returns the last column to the nested list accesses an object (var.names) defined outside its own scope.
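
One possible way to avoid that outside reference is to pass the name of the value column as an explicit argument. The following is only a sketch under that assumption: nested_split is a hypothetical helper, and it swaps plyr::dlply for base R's split and lapply, so it is not a drop-in replacement for nested.dlply.

```r
# Sketch: nest `df` by the columns named in `by`;
# the leaves hold the column named in `value`.
nested_split <- function(df, by, value) {
  groups <- split(df, df[[by[1]]])
  if (length(by) == 1) {
    # Last grouping column: keep only the value column as a plain vector
    lapply(groups, function(g) g[[value]])
  } else {
    # More grouping columns left: recurse on each group
    lapply(groups, nested_split, by = by[-1], value = value)
  }
}

x <- data.frame(a = letters[1:3], b = letters[4:6], c = letters[7:9],
                stringsAsFactors = FALSE)
# All 27 combinations, built with base R instead of tidyr
d <- expand.grid(a = x$a, b = x$b, c = x$c, stringsAsFactors = FALSE)

d.list <- nested_split(d, by = c("a", "b"), value = "c")
d.list$a$d
```

Because by and value are arguments, the function no longer depends on var.names or n existing in the calling environment.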

  • Thanks! I hadn't thought of using recursion!
