Representing a hierarchical data.frame as a nested list

Time: 2013-01-25 20:08:36

Tags: regex r recursion

How can I neatly convert a data.frame with hierarchical information to JSON (or a nested list)?

Suppose we have the following data.frame:

df <- data.frame(
  id = c('1', '1.1', '1.1.1', '1.2'), 
  value = c(10, 5, 5, 5)) 

#  id   value
#     1    10
#   1.1     5
# 1.1.1     5
#   1.2     5

Then I would like to get the following JSON:

{
 "id": "1",
 "value": 10,
 "children": [
  {
   "id": "1.1",
   "value": 5,
   "children": [
    {
     "id": "1.1.1", 
     "value": 5 
    }
   ]
  },
  {
   "id": "1.2",
   "value": 5
  }
 ]
}

The id column defines the hierarchy, with . as the delimiter.
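
To make the id convention concrete, here is a minimal base R sketch of how a depth and a parent could be read off the dotted ids. The helper names id_depth and id_parent are hypothetical and not part of the question.

```r
# Hypothetical helpers (not from the question): derive depth and parent from dotted ids.
ids <- c("1", "1.1", "1.1.1", "1.2")

# Depth = number of dot-separated components in the id.
id_depth <- function(id) lengths(strsplit(id, ".", fixed = TRUE))

# Parent = id with its last ".component" removed; top-level ids get NA.
id_parent <- function(id) {
  parent <- sub("\\.[^.]*$", "", id)
  parent[parent == id] <- NA_character_
  parent
}

depth  <- id_depth(ids)
parent <- id_parent(ids)
print(data.frame(id = ids, depth = depth, parent = parent))
```

With the parent known for every row, the nesting in the JSON above is simply "each row appears in the children of its parent".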

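My goal is to be able to easily convert data from R into hierarchical D3 visualizations (e.g. Partition Layout, Zoomable Treemaps). It would also be nice if more "value" columns could be added, e.g. value, size, weight.

Thanks!

Edit: I restored the original question so that all the answers are easier to follow (sorry for all the edits).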

2 answers:

Answer 0: (score: 3)

I tend to use RJSONIO for this:

R> df <- data.frame(id = c('1', '1.1', '1.1.1', '1.2'), value = c(10, 5, 5, 5)) 
R> RJSONIO::toJSON(df)
[1] "{\n \"id\": [ \"1\", \"1.1\", \"1.1.1\", \"1.2\" ],\n\"value\": [     10,      5,      5,      5 ] \n}"
R> cat(RJSONIO::toJSON(df), "\n")
{
 "id": [ "1", "1.1", "1.1.1", "1.2" ],
"value": [     10,      5,      5,      5 ] 
} 
R> 

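This is not your desired output, because the nesting/hierarchy you want is not present as structure in the data.frame (it is only encoded in the id strings). I think if you nest the data inside a list, you will get there.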
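
As a rough illustration of that suggestion, the sketch below builds the nesting by hand as plain R lists for the four example rows. The variable name nested is only for illustration, and the RJSONIO call runs only if the package happens to be installed.

```r
# Hand-built nested list mirroring the desired JSON (illustrative only).
nested <- list(
  id = "1", value = 10,
  children = list(
    list(id = "1.1", value = 5,
         children = list(list(id = "1.1.1", value = 5))),
    list(id = "1.2", value = 5)
  )
)

# Inspect the structure in base R.
str(nested)

# Serialize only if RJSONIO is available on this machine.
if (requireNamespace("RJSONIO", quietly = TRUE)) {
  cat(RJSONIO::toJSON(nested), "\n")
}
```

Writing the list by hand obviously does not scale, so the real task is to generate this structure from the id column.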

Edit: For your revised question, here is what R produces when it reads your JSON as input:

R> RJSONIO::fromJSON("/tmp/foo.json")
$id
[1] "1"

$value
[1] 10

$children
$children[[1]]
$children[[1]]$id
[1] "1.1"

$children[[1]]$value
[1] 5

$children[[1]]$children
$children[[1]]$children[[1]]
$children[[1]]$children[[1]]$id
[1] "1.1.1"

$children[[1]]$children[[1]]$value
[1] 5




$children[[2]]
$children[[2]]$id
[1] "1.2"

$children[[2]]$value
[1] 5



R> 

Answer 1: (score: 1)

A possible solution.

First I define the following functions:

# Function to get the number of hierarchical dimensions (occurrences of "." + 1)
ch_dim <- function(x, delimiter = ".") {
    x <- as.character(x)
    chr.count <- function(x) length(which(unlist(strsplit(x, NULL)) == delimiter))
    if (length(x) > 1) {
        sapply(x, chr.count) + 1
    } else {
        chr.count(x) + 1
    }
}

# Function to convert a hierarchical data.frame to a nested list
lst_fun <- function(ch, id_col = "id", num = min(d), stp = max(d)) {

    # Convert data.frame to character
    ch <- data.frame(lapply(ch, as.character), stringsAsFactors=FALSE)

    # Get number of hierarchical dimensions
    d <- ch_dim(ch[[id_col]])

    # Convert to list
    lapply(ch[d == num,][[id_col]], function(x) {
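        # Descendants of x: ids starting with the literal prefix "x."
        # (a regex "^x." would treat "." as a wildcard, so "1" would also match "10.1")
        tt <- ch[startsWith(ch[[id_col]], paste0(x, ".")), ]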
        current <- ch[ch[[id_col]] == x,]
        if (stp != num && nrow(tt) > 0) { 
            c(current, list(children = lst_fun(tt, id_col, num + 1, stp)))
        } else { current }
    })
}

Then convert the data.frame to a list:

lst <- lst_fun(df, "id")

Finally, the JSON:

s <- RJSONIO::toJSON(lst)
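
Note that lst_fun converts every column to character, so numeric values would be serialized as strings. As a hedged alternative (not the answerer's method), the sketch below recurses on parent ids and keeps each column's type, which also carries along extra columns. The function name build_tree and the size column are made up for illustration.

```r
# Hypothetical alternative to lst_fun: recurse on parent ids and keep column types.
df <- data.frame(
  id = c("1", "1.1", "1.1.1", "1.2"),
  value = c(10, 5, 5, 5),
  size = c(4, 2, 2, 2),            # extra column, invented for this example
  stringsAsFactors = FALSE
)

build_tree <- function(data, id_col = "id") {
  ids <- as.character(data[[id_col]])
  # Parent id = id minus its last ".component"; top-level ids get NA.
  parent <- sub("\\.[^.]*$", "", ids)
  parent[parent == ids] <- NA_character_

  # Build the node for row i, attaching children only when there are any.
  node <- function(i) {
    out <- as.list(data[i, , drop = FALSE])  # one row as a plain named list
    out[[id_col]] <- ids[i]
    kids <- which(!is.na(parent) & parent == ids[i])
    if (length(kids) > 0) out$children <- lapply(kids, node)
    out
  }

  # One tree per top-level id; rows whose parent id is absent are not reached.
  lapply(which(is.na(parent)), node)
}

tree <- build_tree(df)
str(tree[[1]], max.level = 2)
```

Because a data.frame may contain several top-level ids, the result is a list of trees; here tree[[1]] is the single root, and it can be passed to RJSONIO::toJSON in the same way as lst.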