Hierarchical JSON from R using jsonlite?

Time: 2017-04-13 18:50:12

Tags: json r jsonlite

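My end goal is to create a tree visualization with D3js from a hierarchical JSON file.

The hierarchy I need to represent is the diagram below, where A has children B, C, D; B has children E, F, G; C has children H, I; and D has no children. Nodes will have multiple key:value pairs. For simplicity, I have listed only 3.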

                             -- name:E
                            |   type:dkBlue
                            |   id: 005
                            |
                            |-- name:F
            -- name:B ------|   type:medBlue 
            |  type:blue    |   id: 006
            |  id:002       |
            |               |-- name:G
            |                   type:ltBlue
 name:A ----|                   id:007     
 type:colors|
 id:001     |-- name:C  ----|-- name:H
            |   type:red    |   type:dkRed         
            |   id:003      |    id:008
            |               |  
            |               |
            |               |-- name:I
            |                   type:medRed
            |                   id:009
            |-- name:D
                type:green
                id: 004

My source data in R looks like this:

nodes <-read.table(header = TRUE, text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")

links <- read.table(header = TRUE, text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")

I need to convert this to the following JSON:

{"name": "A",
 "type": "colors",
 "id" : "001",
 "children": [
    {"name": "B",
      "type": "blue",
      "id"  : "002", 
      "children": [
          {"name": "E",
           "type": "dkBlue",
           "id"  : "005"},
          {"name": "F", 
           "type": "medBlue",
           "id": "006"},
          {"name": "G", 
           "type": "ltBlue",
           "id": "007"}
    ]},
    {"name": "C",
      "type": "red",
      "id"  : "003", 
      "children": [
          {"name": "H",
           "type": "dkRed",
           "id"  : "008"},
          {"name": "I", 
           "type": "medRed",
           "id": "009"}
    ]},
    {"name": "D",
      "type": "green",
      "id"  : "004"}
]}

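Thanks for any help you can provide!

[Update 2017-04-18]

Following Ian's reference, I looked at data.tree for R. If I restructure my data, I can recreate my hierarchy as shown below. Note that I lose the relationship type (hasSubCat) between each pair of nodes, whose value may differ for each link/edge in real life. I am willing to let that go (for now) if I can get a workable hierarchy. The revised data for data.tree: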

df <-read.table(header = TRUE, text = "
paths  type     id 
A      colors   001
A/B    blue     002
A/B/E  dkBlue   005
A/B/F  medBlue  006
A/B/G  ltBlue   007
A/C    red      003
A/C/H  dkRed    008
A/C/I  medRed   009
A/D    green    004
")

library(data.tree)

myPaths <- as.Node(df, pathName = "paths")
myPaths$leafCount / (myPaths$totalCount - myPaths$leafCount)
print(myPaths, "type", "id", limit = 25)

The printout shows the hierarchy I sketched in the original post, and even includes the key:value pairs for each node. Nice!

  levelName    type id
1 A          colors  1
2  ¦--B        blue  2
3  ¦   ¦--E  dkBlue  5
4  ¦   ¦--F medBlue  6
5  ¦   °--G  ltBlue  7
6  ¦--C         red  3
7  ¦   ¦--H   dkRed  8
8  ¦   °--I  medRed  9
9  °--D       green  4

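Once again, I am lost as to how to convert this from a tree into nested JSON. Like most examples, the one at https://ipub.com/data-tree-to-networkd3/ assumes that key:value pairs exist only on leaf nodes, not on branch nodes. I think the answer is to create a nested list to feed to RJSONIO or jsonlite, but I don't know how to do that.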

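3 Answers:

Answer 0: (score: 3)

data.tree is very helpful and is probably the better way to accomplish your objective. For fun, I will submit a more roundabout way of getting your nested JSON, using igraph and d3r.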

nodes <-read.table(header = TRUE, text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")

links <- read.table(header = TRUE, text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")

library(d3r)
library(dplyr)
library(igraph)

# make it an igraph
gf <- graph_from_data_frame(links[,c(1,3,2)],vertices = nodes)

# if we know that this is a tree with root as "A"
#  we can do something like this
df_tree <- dplyr::bind_rows(
  lapply(
    all_shortest_paths(gf,from="A")$res,
    function(x){data.frame(t(names(unclass(x))), stringsAsFactors=FALSE)}
  )
)

# we can discard the first column
df_tree <- df_tree[,-1]
# then make df_tree[1,1] as 1 (A)
df_tree[1,1] <- "A"

# now add node attributes to our data.frame
df_tree <- df_tree %>%
  # let's get the last non-NA in each row so we can join with nodes
  mutate(
    last_non_na = apply(df_tree, MARGIN=1, function(x){tail(na.exclude(x),1)})
  ) %>%
  # now join with nodes
  left_join(
    nodes,
    by = c("last_non_na" = "name")
  ) %>%
  # now remove last_non_na column
  select(-last_non_na)

# use d3r to nest as we would like
nested <- df_tree %>%
  d3_nest(value_cols = c("ID", "type"))

Answer 1: (score: 1)

Consider walking down the levels and iteratively converting the data frame rows into a multi-nested list:

library(jsonlite)
...
df2list <- function(i) as.vector(nodes[nodes$name == i,])

# GRANDPARENT LEVEL
jsonlist <- as.list(nodes[nodes$name=='A',])
# PARENT LEVEL       
jsonlist$children <- lapply(c('B','C','D'), function(i) as.list(nodes[nodes$name == i,]))
# CHILDREN LEVEL
jsonlist$children[[1]]$children <- lapply(c('E','F','G'), df2list)
jsonlist$children[[2]]$children <- lapply(c('H','I'), df2list)

toJSON(jsonlist, pretty=TRUE)

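However, with this approach you will notice that some values and inner children are wrapped in brackets. R has no scalar type and cannot hold complex types inside a character vector, so the whole object must be a list, and its length-one elements are output as arrays in brackets.

So, consider stripping the extra brackets with nested gsub calls, which still renders valid JSON: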

output <- toJSON(jsonlist, pretty=TRUE)

gsub('"\\]\n', '"\n', gsub('"\\],\n', '",\n', gsub('": \\["', '": "', output)))
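As an alternative to editing the string afterwards, jsonlite's toJSON accepts an auto_unbox argument (also used in a later answer) that writes length-one vectors as scalars. The following is only a minimal sketch, assuming each node is stored as a plain named list rather than a one-row data frame; the node object is a made-up example, not the full data set.

```r
library(jsonlite)

# Hypothetical two-level node, built as plain named lists
node <- list(
  ID = "001", name = "A", type = "colors",
  children = list(
    list(ID = "002", name = "B", type = "blue")
  )
)

# Default: every length-one vector becomes a one-element array
boxed <- toJSON(node)

# auto_unbox = TRUE: length-one vectors become scalars,
# while the unnamed children list is still written as an array
unboxed <- toJSON(node, auto_unbox = TRUE)

cat(boxed, "\n")
cat(unboxed, "\n")
```

Note that a one-row data frame is still serialized as an array of records, so converting each row with as.list first matters under this approach.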

Final output:

{
  "ID": "001",
  "name": "A",
  "type": "colors",
  "children": [
    {
      "ID": "002",
      "name": "B",
      "type": "blue",
      "children": [
        {
          "ID": "005",
          "name": "E",
          "type": "dkBlue"
        },
        {
          "ID": "006",
          "name": "F",
          "type": "medBlue"
        },
        {
          "ID": "007",
          "name": "G",
          "type": "ltBlue"
        }
      ]
    },
    {
      "ID": "003",
      "name": "C",
      "type": "red",
      "children": [
        {
          "ID": "008",
          "name": "H",
          "type": "dkRed"
        },
        {
          "ID": "009",
          "name": "I",
          "type": "medRed"
        }
      ]
    },
    {
      "ID": "004",
      "name": "D",
      "type": "green"
    }
  ]
} 
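Because regex substitutions on JSON text can be fragile (for example, a string value that itself ends in a bracket), it may be worth checking the cleaned result. A small sketch, assuming jsonlite's validate and fromJSON functions and a made-up node list in place of the full data:

```r
library(jsonlite)

# Made-up node; no auto_unbox, so scalars are boxed as in the answer above
node <- list(ID = "001", name = "A",
             children = list(list(ID = "002", name = "B")))
output <- toJSON(node, pretty = TRUE)

# The same three substitutions as in the answer
cleaned <- gsub('"\\]\n', '"\n',
                gsub('"\\],\n', '",\n',
                     gsub('": \\["', '": "', output)))

# validate() returns TRUE when the text parses as JSON
is_valid <- validate(cleaned)

# Parse it back to inspect the resulting structure
parsed <- fromJSON(cleaned, simplifyVector = FALSE)
str(parsed)
```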

Answer 2: (score: 1)

A nice way of doing this, if a bit hard to wrap one's head around, is with a self-referential (recursive) function, as follows...

nodes <- read.table(header = TRUE, colClasses = "character", text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")

links <- read.table(header = TRUE, colClasses = "character", text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")

convert_hier <- function(linksDf, nodesDf, sourceId = "startID", 
                         targetId = "endID", nodesID = "ID") {
  makelist <- function(nodeid) {
    child_ids <- linksDf[[targetId]][which(linksDf[[sourceId]] == nodeid)]

    if (length(child_ids) == 0) 
      return(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]))

    c(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]), 
      children = list(lapply(child_ids, makelist)))
  }

  ids <- unique(c(linksDf[[sourceId]], linksDf[[targetId]]))
  rootid <- ids[! ids %in% linksDf[[targetId]]]
  jsonlite::toJSON(makelist(rootid), pretty = TRUE, auto_unbox = TRUE)
}

convert_hier(links, nodes)

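A few notes...

  1. I added colClasses = "character" to the read.table commands so that the ID numbers are not coerced to integers without leading zeros, and so that the strings are not converted to factors.
  2. I wrapped everything in the convert_hier function to make it easier to adapt to other scenarios, but the real magic is in the makelist function.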
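The effect described in the first note can be seen on a tiny table. This is only an illustrative sketch in base R; the txt table is made up for the demonstration.

```r
# Tiny made-up table with zero-padded IDs
txt <- "
ID  name
001 A
002 B
"

# Default: read.table guesses the column type, so ID becomes an integer
guessed <- read.table(header = TRUE, text = txt)

# colClasses = "character" keeps every column as text
as_text <- read.table(header = TRUE, colClasses = "character", text = txt)

str(guessed$ID)   # integer values, leading zeros lost
str(as_text$ID)   # character values, leading zeros kept
```

The factor conversion applies to R versions before 4.0.0, where stringsAsFactors defaulted to TRUE; the leading-zero issue applies regardless of version.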