How to convert a DataFrame to nested JSON

Date: 2019-05-09 14:14:06

Tags: python json pandas dataframe d3.js

I'm trying to export a DataFrame to nested (hierarchical) JSON for D3.js, using a solution that only works for one level (parent, child).

Any help would be appreciated. I'm new to Python.

My DataFrame contains 7 levels. This is the expected result:


JSON Example:
    {
        "name": "World",
        "children": [
            {
                "name": "Europe",
                "children": [
                    {
                        "name": "France",
                        "children": [
                            {
                                "name": "Paris",
                                "population": 1000000
                            }
                        ]
                    }
                ]
            }
        ]
    }

Here is the Python method:


import json


def to_flare_json(df, filename):
    """Convert dataframe into nested JSON as in flare files used for D3.js"""
    flare = dict()
    d = {"name":"World", "children": []}

    for index, row in df.iterrows():
        parent = row[0]
        child = row[1]
        child1 = row[2]
        child2 = row[3]
        child3 = row[4]
        child4 = row[5]
        child5 = row[6]
        child_value = row[7]

        # Make a list of keys
        key_list = []
        for item in d['children']:
            key_list.append(item['name'])

        #if 'parent' is NOT a key in flare.JSON, append it
        if not parent in key_list:
            d['children'].append({"name": parent, "children":[{"value": child_value, "name1": child}]})
        # if parent IS a key in flare.json, add a new child to it
        else:
            d['children'][key_list.index(parent)]['children'].append({"value": child_value, "name11": child})
    flare = d
    # export the final result to a json file
    with open(filename +'.json', 'w') as outfile:
        json.dump(flare, outfile, indent=4,ensure_ascii=False)
    return ("Done")

[Edit]

Here is a sample of my df:

World  Continent  Region          Country  State          City   Boroughs  Population
1      Europe     Western Europe  France   Ile de France  Paris  17        821964
1      Europe     Western Europe  France   Ile de France  Paris  19        821964
1      Europe     Western Europe  France   Ile de France  Paris  20        821964

2 answers:

Answer 0 (score: 0)

The structure you want is clearly recursive, so I wrote a recursive function to populate it:

def create_entries(df):
    entries = []
    # Stopping case
    if df.shape[1] == 2:  # only 2 columns left
        for i in range(df.shape[0]):  # iterating on rows
            entries.append(
                {"Name": df.iloc[i, 0],
                 df.columns[-1]: df.iloc[i, 1]}
            )
    # Iterating case
    else:
        values = set(df.iloc[:, 0])  # Getting the set of unique values
        for v in values:
            entries.append(
                {"Name": v,
                 # reiterating the process but without the first column
                 # and only the rows with the current value
                 "Children": create_entries(
                     df.loc[df.iloc[:, 0] == v].iloc[:, 1:]
                 )}
            )
    return entries

All that's left is to create the dictionary and call the function. Then you just need to write the dictionary to a JSON file.

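I hope my comments are clear enough: the idea is to recursively use the first column of the dataset as the "Name" and the remaining columns as the "Children".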
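For anyone unsure about the calling step, here is a minimal sketch of how it might look. It assumes the column names from the question's sample and skips the constant "World" column, using it as the root node instead. The function is repeated so the block runs on its own. `to_builtin` is a hypothetical helper of mine, added because pandas hands back NumPy scalars that the `json` module cannot serialize directly.

```python
import json

import numpy as np
import pandas as pd


def create_entries(df):
    """Recursive function from this answer, repeated so the block runs alone."""
    entries = []
    if df.shape[1] == 2:  # stopping case: a name column and a value column
        for i in range(df.shape[0]):
            entries.append({"Name": df.iloc[i, 0],
                            df.columns[-1]: df.iloc[i, 1]})
    else:
        for v in set(df.iloc[:, 0]):
            entries.append({"Name": v,
                            "Children": create_entries(
                                df.loc[df.iloc[:, 0] == v].iloc[:, 1:])})
    return entries


def to_builtin(value):
    """Hypothetical helper: turn NumPy scalars into plain Python values."""
    if isinstance(value, np.generic):
        return value.item()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


# Sample rows modeled on the question's DataFrame (column names assumed).
columns = ["World", "Continent", "Region", "Country",
           "State", "City", "Boroughs", "Population"]
rows = [
    [1, "Europe", "Western Europe", "France", "Ile de France", "Paris", 17, 821964],
    [1, "Europe", "Western Europe", "France", "Ile de France", "Paris", 19, 821964],
    [1, "Europe", "Western Europe", "France", "Ile de France", "Paris", 20, 821964],
]
df = pd.DataFrame(rows, columns=columns)

# Drop the first column and use "World" as the root of the tree.
tree = {"Name": "World", "Children": create_entries(df.iloc[:, 1:])}

# default= is called for every object json cannot serialize natively.
text = json.dumps(tree, indent=4, ensure_ascii=False, default=to_builtin)
```

To write a file instead of a string, the same keyword arguments can be passed to `json.dump` together with an open file handle.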

Answer 1 (score: 0)

Thanks for your answer, Syncrossus, but this produces a separate branch for each borough or city. The result looks like this:

"Name": "World",
    "Children": [
        {
            "Name": "Western Europe",
            "Children": [
                {
                    "Name": "France",
                    "Children": [
                        {
                            "Name": "Ile de France",
                            "Children": [
                                {
                                    "Name": "Paris",
                                    "Children": [
                                        {
                                            "Name": "17ème",
                                            "Population": 821964
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        },{
            "Name": "Western Europe",
            "Children": [
                {
                    "Name": "France",
                    "Children": [
                        {
                            "Name": "Ile de France",
                            "Children": [
                                {
                                    "Name": "Paris",
                                    "Children": [
                                        {
                                            "Name": "10ème",
                                            "Population": 154623
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }


But the desired result is this:


"Name": "World",
    "Children": [
      {
        "Continent": "Europe",
        "Children": [
          {
            "Region": "Western Europe",
            "Children": [
              {
                "Country": "France",
                "Children": [
                  {
                    "State": "Ile De France",
                    "Children": [
                      {
                        "City": "Paris",
                        "Children": [
                          {
                            "Boroughs": "17ème",
                            "Population": 82194
                          },
                          {
                            "Boroughs": "16ème",
                            "Population": 99194
                          }
                        ]
                      },
                      {
                        "City": "Saint-Denis",
                        "Children": [
                            {
                              "Boroughs": "10ème",
                              "Population": 1294
                            },
                            {
                              "Boroughs": "11ème",
                              "Population": 45367
                            }
                          ]
                        }
                      ]
                    }
                  ]
                },
                {
                  "Country": "Belgium",
                  "Children": [
                    {
                      "State": "Oost-Vlaanderen",
                      "Children": [
                        {
                          "City": "Gent",
                          "Children": [
                            {
                              "Boroughs": "2ème",
                              "Population": 1234
                            },
                            {
                              "Boroughs": "4ème",
                              "Population": 7456
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
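
One possible way to get this grouped shape, sketched under assumptions rather than offered as a confirmed fix: group the rows on each level with `DataFrame.groupby`, so that equal values share a single branch, and use the column name as the key of each node. `nest` is a hypothetical helper, not part of pandas. The rows are taken from the desired result above, and the intermediate levels are assumed to hold strings.

```python
import json

import pandas as pd


def nest(df, levels, value_col):
    """Group rows level by level so equal values share one branch.

    Hypothetical helper: `levels` lists the hierarchy columns from the
    outermost to the innermost, `value_col` is the leaf value column.
    """
    level, rest = levels[0], levels[1:]
    if not rest:
        # Innermost level: one leaf per row. tolist() yields plain Python
        # values, which json can serialize without extra handling.
        return [{level: name, value_col: value}
                for name, value in zip(df[level].tolist(),
                                       df[value_col].tolist())]
    nodes = []
    # sort=False keeps the groups in order of first appearance.
    for name, group in df.groupby(level, sort=False):
        nodes.append({level: name, "Children": nest(group, rest, value_col)})
    return nodes


# Rows taken from the desired result above (a subset, for brevity).
levels = ["Continent", "Region", "Country", "State", "City", "Boroughs"]
rows = [
    ["Europe", "Western Europe", "France", "Ile De France", "Paris", "17ème", 82194],
    ["Europe", "Western Europe", "France", "Ile De France", "Paris", "16ème", 99194],
    ["Europe", "Western Europe", "France", "Ile De France", "Saint-Denis", "10ème", 1294],
    ["Europe", "Western Europe", "Belgium", "Oost-Vlaanderen", "Gent", "2ème", 1234],
]
df = pd.DataFrame(rows, columns=levels + ["Population"])

tree = {"Name": "World", "Children": nest(df, levels, "Population")}
text = json.dumps(tree, indent=2, ensure_ascii=False)
```

Because each recursive call only sees the rows of its own group, Paris and Saint-Denis end up as siblings under the same "Ile De France" node instead of in separate branches.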