Create a DataFrame to generate JSON in a given format

Time: 2017-10-30 13:42:42

Tags: python json pandas dataframe data-analysis

I am trying to build a DataFrame from which I can generate the JSON below.

JSON data:

{
 "name": "flare",
 "children":  [
    {
     "name": "K1",
     "children": [
      {"name": "Exact", "size": 4},
      {"name": "synonyms", "size": 14}
     ]
    },
    {
     "name": "K2",
     "children": [
      {"name": "Exact", "size": 10},
      {"name": "synonyms", "size": 20}
     ]
    },
    {
     "name": "K3",
     "children": [
      {"name": "Exact", "size": 0},
      {"name": "synonyms", "size": 5}
     ]
    },
    {
     "name": "K4",
     "children": [
      {"name": "Exact", "size": 13},
      {"name": "synonyms", "size": 15}
     ]
    },
    {
     "name": "K5",
     "children": [
      {"name": "Exact", "size": 0},
      {"name": "synonyms", "size": 0}
     ]
    }
 ]
}

Input data:

name    Exact   synonyms
K1        4       14
K2        10      20
K3        0       5
K4        13      15
K5        0       0

I tried building a DataFrame from the values in the JSON, but df.to_json does not give me the required nested JSON. Please help.

1 answer:

Answer 0: (score: 1)

You need to reshape the data with set_index + stack, then use groupby with apply to build the nested list of dicts:

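import json

# df is the input table from the question (columns: name, Exact, synonyms)
df = (df.set_index('name')
        .stack()                 # wide -> long: one row per (name, column) pair
        .reset_index(level=1)    # move the former column labels into a column
        .rename(columns={'level_1':'name', 0:'size'})
        .groupby(level=0).apply(lambda x: x.to_dict(orient='records'))  # list of dicts per K name
        .reset_index(name='children')
        )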

print(df)
  name                                           children
0   K1  [{'name': 'Exact', 'size': 4}, {'name': 'synon...
1   K2  [{'name': 'Exact', 'size': 10}, {'name': 'syno...
2   K3  [{'name': 'Exact', 'size': 0}, {'name': 'synon...
3   K4  [{'name': 'Exact', 'size': 13}, {'name': 'syno...
4   K5  [{'name': 'Exact', 'size': 0}, {'name': 'synon...
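
To see what the reshape does before the groupby step, here is a minimal self-contained sketch. It rebuilds the input table from the question by hand; the names `raw`, `stacked` and `long_df` are my own and only for illustration. It assigns the column names directly instead of using rename, which is a small deviation from the chain above.

```python
import pandas as pd

# Input table from the question, rebuilt by hand for illustration.
raw = pd.DataFrame({
    "name": ["K1", "K2", "K3", "K4", "K5"],
    "Exact": [4, 10, 0, 13, 0],
    "synonyms": [14, 20, 5, 15, 0],
})

# set_index + stack turns every (row, column) cell into one entry
# of a long Series indexed by (name, former column label).
stacked = raw.set_index("name").stack()

# Move the former column labels out of the index, then give both
# columns the key names that the target JSON uses.
long_df = stacked.reset_index(level=1)
long_df.columns = ["name", "size"]

print(long_df.head(4))
```

Each K name now owns two rows (one for Exact, one for synonyms), which is why grouping on the index and calling to_dict(orient='records') per group yields the inner "children" lists.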

# convert the output to a dict and wrap it in the root node
j = {"name": "flare", "children": df.to_dict(orient='records')}
# pretty-print for easier checking
import pprint
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(j)
{   'children': [   {   'children': [   {'name': 'Exact', 'size': 4},
                                        {'name': 'synonyms', 'size': 14}],
                        'name': 'K1'},
                    {   'children': [   {'name': 'Exact', 'size': 10},
                                        {'name': 'synonyms', 'size': 20}],
                        'name': 'K2'},
                    {   'children': [   {'name': 'Exact', 'size': 0},
                                        {'name': 'synonyms', 'size': 5}],
                        'name': 'K3'},
                    {   'children': [   {'name': 'Exact', 'size': 13},
                                        {'name': 'synonyms', 'size': 15}],
                        'name': 'K4'},
                    {   'children': [   {'name': 'Exact', 'size': 0},
                                        {'name': 'synonyms', 'size': 0}],
                        'name': 'K5'}],
    'name': 'flare'}
# convert the data to JSON and write it to a file
with open('data.json', 'w') as outfile:
    json.dump(j, outfile)
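
As an alternative to the groupby approach above, the same structure can probably be built with a plain loop over the rows, which some readers may find easier to follow. This is only a sketch: `to_flare` is a hypothetical helper name, the input table is rebuilt by hand from the question, and the `int(...)` cast is there on the assumption that some pandas versions may hand back NumPy integers, which the json module cannot serialize.

```python
import json

import pandas as pd

# Input table from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "name": ["K1", "K2", "K3", "K4", "K5"],
    "Exact": [4, 10, 0, 13, 0],
    "synonyms": [14, 20, 5, 15, 0],
})


def to_flare(frame, root="flare", key="name"):
    """Build the nested name/children/size dict from a wide table (hypothetical helper)."""
    value_cols = [c for c in frame.columns if c != key]
    children = []
    for row in frame.to_dict(orient="records"):
        # One leaf per value column; cast to a plain int so json can serialize it.
        leaves = [{"name": col, "size": int(row[col])} for col in value_cols]
        children.append({"name": row[key], "children": leaves})
    return {"name": root, "children": children}


tree = to_flare(df)
text = json.dumps(tree, indent=1)  # indent only affects readability
print(text)
```

Using json.dumps with indent gives a string that can be compared by eye with the target JSON in the question; json.dump(tree, outfile) writes the same data to a file.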