How to convert this nested JSON into a Pandas DataFrame in columnar form

Time: 2015-02-04 11:03:01

Tags: python pandas

How can I read this nested JSON format into pandas in columnar form?

JSON schema

Python script

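    import json

    import pandas as pd
    import requests

    # REQUEST_API holds the endpoint URL (not shown in the question)
    req = requests.get(REQUEST_API)
    returned_data = json.loads(req.text)
    # status
    print("status: {0}".format(returned_data["status"]))
    # api version
    print("version: {0}".format(returned_data["version"]))
    # build a DataFrame from the "data" payload
    data_in_columnar_form = pd.DataFrame(returned_data["data"])
    data = data_in_columnar_form["data"]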

Update

I want to convert the following JSON schema into a tabular format. How can I do that?

JSON schema

     "data":[  
        {  
           "year":"2009",
           "values":[  
              {  
                 "Actual":"(0.2)"
              },
              {  
                 "Upper End of Range":"-"
              },
              {  
                 "Upper End of Central Tendency":"-"
              },
              {  
                 "Lower End of Central Tendency":"-"
              },
              {  
                 "Lower End of Range":"-"
              }
           ]
        },
        {  
           "year":"2010",
           "values":[  
              {  
                 "Actual":"2.8"
              },
              {  
                 "Upper End of Range":"-"
              },
              {  
                 "Upper End of Central Tendency":"-"
              },
              {  
                 "Lower End of Central Tendency":"-"
              },
              {  
                 "Lower End of Range":"-"
              }
           ]
        },...
        ]

1 answer:

Answer 0 (score: 8)

Pandas has a JSON normalization function, json_normalize (available as of 0.13). This example comes straight from the documentation:

In [205]: from pandas.io.json import json_normalize

In [206]: data = [{'state': 'Florida',
   .....:           'shortname': 'FL',
   .....:           'info': {
   .....:                'governor': 'Rick Scott'
   .....:           },
   .....:           'counties': [{'name': 'Dade', 'population': 12345},
   .....:                       {'name': 'Broward', 'population': 40000},
   .....:                       {'name': 'Palm Beach', 'population': 60000}]},
   .....:          {'state': 'Ohio',
   .....:           'shortname': 'OH',
   .....:           'info': {
   .....:                'governor': 'John Kasich'
   .....:           },
   .....:           'counties': [{'name': 'Summit', 'population': 1234},
   .....:                        {'name': 'Cuyahoga', 'population': 1337}]}]
   .....: 

In [207]: json_normalize(data, 'counties', ['state', 'shortname', ['info', 'governor']])
Out[207]: 
         name  population info.governor    state shortname
0        Dade       12345    Rick Scott  Florida        FL
1     Broward       40000    Rick Scott  Florida        FL
2  Palm Beach       60000    Rick Scott  Florida        FL
3      Summit        1234   John Kasich     Ohio        OH
4    Cuyahoga        1337   John Kasich     Ohio        OH
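
The documentation example does not map one-to-one onto the JSON in the question, because there each element of "values" is a dict with a single key. Passing "values" as the record path to json_normalize would therefore yield one sparse row per value rather than one row per year. A minimal sketch of one possible alternative, assuming the payload looks exactly like the excerpt in the question, is to merge the single-key dicts into one row per year before building the DataFrame. The inline `returned_data` literal below is a stand-in for the API response, and the names `rows` and `df` are illustrative only. (In pandas 1.0 and later, json_normalize is also exposed at the top level as `pd.json_normalize`.)

```python
import pandas as pd

# Stand-in for the API response; only the two years shown in the question.
returned_data = {
    "data": [
        {
            "year": "2009",
            "values": [
                {"Actual": "(0.2)"},
                {"Upper End of Range": "-"},
                {"Upper End of Central Tendency": "-"},
                {"Lower End of Central Tendency": "-"},
                {"Lower End of Range": "-"},
            ],
        },
        {
            "year": "2010",
            "values": [
                {"Actual": "2.8"},
                {"Upper End of Range": "-"},
                {"Upper End of Central Tendency": "-"},
                {"Lower End of Central Tendency": "-"},
                {"Lower End of Range": "-"},
            ],
        },
    ]
}

rows = []
for entry in returned_data["data"]:
    # Start each row with the year, then fold every single-key dict into it.
    row = {"year": entry["year"]}
    for item in entry["values"]:
        row.update(item)
    rows.append(row)

# One row per year, one column per measure.
df = pd.DataFrame(rows).set_index("year")
print(df)
```

All cells remain strings here; placeholders such as "-" and parenthesised values such as "(0.2)" would need a separate cleaning step before any numeric work.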