Creating D3 nested JSON data with Python

Asked: 2014-04-01 04:56:05

Tags: python json d3.js

I am trying to build a Python function that formats data into the JSON string that D3 consumes.

The format I need is:

{
 "name": "flare",
 "children": [
  {
   "name": "analytics",
   "children": [
    {
     "name": "cluster",
     "children": [
      {"name": "AgglomerativeCluster", "size": 3938},
      {"name": "CommunityStructure", "size": 3812},
      {"name": "HierarchicalCluster", "size": 6714},
      {"name": "MergeEdge", "size": 743}
     ]
    },

Source: http://bl.ocks.org/mbostock/4063550, for this type of visualization: http://johan.github.io/d3/ex/tree.html

So far, what I have come up with is a data structure like:

{'nlp':{'course':['course','range','topics','language','processing','word']}}

I need it to look like:

{
   "name":"Natural Language Processing",
   "children":[
      {
         "name":"course",
         "children":[
            {
               "name":"course",
               "size":700
            },
            {
               "name":"range",
               "size":700
            },
            {
               "name":"topics",
               "size":700
            },
            {
               "name":"language",
               "size":700
            },
            {
               "name":"processing",
               "size":700
            },
            {
               "name":"word",
               "size":700
            }
         ]
      }
   ]
}

I started down this path:

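import json

def format_d3_circle(data_input):
    d3_data = {}
    # Root level: the single top-level key names the root node
    root_key = next(iter(data_input))
    d3_data['name'] = root_key
    d3_data['children'] = []
    sub_levels = data_input[root_key]
    for level_one_key, level_one_data in sub_levels.items():
        # Leaf nodes for level_one_data are not built yet
        d3_data['children'].append({'name': level_one_key, 'children': []})
    return json.dumps(d3_data)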

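But it seems I am not approaching the problem correctly, and I find it hard to picture a good, efficient way to create the JSON nodes.

Any suggestions on how to abstract this problem and build whatever nested JSON structure I need from dict / list / JSON input?

1 Answer:

Answer 0 (score: 1)

Here is a solution I have been working on. It takes tabular input data and handles the general case of an arbitrary number of levels.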

import pandas as pd
import json

def find_element(children_list,name):
    """
    Find element in children list
    if exists or return none
    """
    for i in children_list:
        if i["name"] == name:
            return i
    #If not found return None
    return None

def add_node(path,value,nest):
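    """
    The path is a list.  Each element is a name that corresponds
    to a level in the final nested dictionary.  The value is stored
    on the leaf node at the end of the path.
    Note that path is consumed (left empty) by this call.
    """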

    #Get first name from path
    this_name = path.pop(0)

    #Does the element exist already?
    element = find_element(nest["children"], this_name)

    #If the element exists, we can use it, otherwise we need to create a new one
    if element:

        if len(path)>0:
            add_node(path,value, element)

    #Else it does not exist so create it and return its children
    else:

        if len(path) == 0:
            nest["children"].append({"name": this_name, "value": value})
        else:
            #Add new element
            nest["children"].append({"name": this_name, "children":[]})

            #Get added element 
            element = nest["children"][-1]

            #Still elements of path left so recurse
            add_node(path,value, element)
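
One thing to be aware of is that add_node empties the path list it is given, and it silently ignores the value when the leaf already exists. As a minimal sketch of an alternative (not part of the original answer), the same insertion can be written as a loop that leaves path untouched. The name add_path is hypothetical, and unlike add_node it overwrites the value of an existing leaf, which is an assumption about the desired behaviour.

```python
def add_path(nest, path, value):
    """Insert value at path below nest without modifying path."""
    node = nest
    for name in path:
        children = node.setdefault("children", [])
        # Reuse the child with this name if it is already there
        child = next((c for c in children if c["name"] == name), None)
        if child is None:
            child = {"name": name}
            children.append(child)
        node = child
    # The last node reached is the leaf; it carries the value
    node["value"] = value


root = {"name": "root", "children": []}
path = ["a", "a1", "a11"]
add_path(root, path, 1)
add_path(root, ["a", "a1", "a12"], 2)
```

After these two calls, path still holds all three names, and both leaves sit under the same "a1" node.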

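Here is an example of how to use it. You have to tell it which columns to use as the levels of the hierarchy and which column stores the value.

from io import StringIO

# Newer pandas versions expect a file-like object rather than a literal JSON string
df = pd.read_json(StringIO('{"l1":{"0":"a","1":"a","2":"a","3":"a","4":"b","5":"b","6":"b","7":"b"},"l2":{"0":"a1","1":"a1","2":"a2","3":"a2","4":"b1","5":"b1","6":"b2","7":"b3"},"l3":{"0":"a11","1":"a12","2":"a21","3":"a22","4":"b11","5":"b12","6":"b22","7":"b34"},"val":{"0":1,"1":2,"2":3,"3":4,"4":5,"5":6,"6":7,"7":8}}'))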


d = {"name": "root",
"children": []}

levels = ["l1","l2", "l3"]
for _, r in df.iterrows():
    # Names of the hierarchy levels for this row, outermost level first
    path = list(r[levels])
    value = r["val"]
    add_node(path,value,d)

print(json.dumps(d, sort_keys=False,
                 indent=2))
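
For the dict-of-lists input shown in the question, a tabular detour may not be needed. The following is a rough sketch, not the answer's method: it assumes that dicts are branches and lists hold leaf names, and that every leaf gets the same placeholder size (700 in the question, whose origin is not stated). The function name dict_to_d3 and the leaf_size parameter are made up for illustration.

```python
import json


def dict_to_d3(name, node, leaf_size=700):
    """Convert nested dicts (branches) and lists (leaves) to name/children form."""
    if isinstance(node, dict):
        # Each key becomes a child node; recurse into its value
        children = [dict_to_d3(key, sub, leaf_size) for key, sub in node.items()]
    else:
        # A list of leaf names; leaf_size is an assumed placeholder
        children = [{"name": leaf, "size": leaf_size} for leaf in node]
    return {"name": name, "children": children}


data = {'nlp': {'course': ['course', 'range', 'topics', 'language', 'processing', 'word']}}
root_key, subtree = next(iter(data.items()))
tree = dict_to_d3(root_key, subtree)
print(json.dumps(tree, indent=2))
```

The root comes out named "nlp" here. Getting a display name such as "Natural Language Processing" would need a separate lookup from key to label, which the input does not contain.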