Flatten JSON file data using the pandas normalizer

Time: 2017-10-06 07:40:16

Tags: python json pandas

I want to flatten a complex nested JSON file. Here is a sample of the JSON data:

{
  "applications": [
  {
      "id": 87334412,
      "name": "cdata1",
      "language": "known",
      "health_status": "unknown",
      "reporting": true,
      "last_reported_at": "2017-10-06T06:30:55+00:00",
      "application_summary": {
        "response_time": 1.2,
        "throughput": 216,
        "error_rate": 0,
        "target": 0.5,
        "ascore": 1,
        "host_count": 3,
        "instance_count": 3
      },
      "settings": {
        "column": 0.5,
        "columns": 7,
        "columns1": true,
        "columns2": false
      },
      "links": {
        "application_data": [
          93818199,
          93819351,
          93819359
        ],
        "servers": [],
        "application_content": [
          32006189,
          87342924,
          47565225
        ]
      }
    },

The code I am using:

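import json
from pandas.io.json import json_normalize

with open('ptr1.json') as json_file:
    json_data = json.load(json_file)
# print(json_data["applications"])
# Note: iterating over the top-level dict yields its keys (e.g. "applications"),
# not the individual application records
for line in json_data:
    data = json_normalize(line, ['name', 'id'])
    print(data)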

Can anyone help me extract the following fields: name, id, last_reported_at, instance_count? Note that the JSON file contains many such id entries.

1 Answer:

Answer 0 (score: 1)

IIUC:

In [34]: d = json.loads(json_str)

In [35]: cols = ['id','name','last_reported_at','application_summary.instance_count']

In [36]: pd.io.json.json_normalize(d['applications'])[cols]
Out[36]:
         id    name           last_reported_at  application_summary.instance_count
0  87334412  cdata1  2017-10-06T06:30:55+00:00                                   3
1  87334444  cdata2  2017-10-05T06:30:55+00:00                                   3
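
For reference, the same approach can be written as a self-contained script. This is only a sketch under a few assumptions: the JSON is inlined as an abbreviated copy of the first record from the question (in practice it would be loaded from the file), a pandas version that exposes the function as `pd.json_normalize` (1.0 or later) is installed, and the final rename to `instance_count` is an optional extra that is not part of the answer above.

```python
import json

import pandas as pd

# Abbreviated copy of the first record from the question, inlined so the
# example runs on its own; normally this would come from json.load(file).
json_str = """
{
  "applications": [
    {
      "id": 87334412,
      "name": "cdata1",
      "last_reported_at": "2017-10-06T06:30:55+00:00",
      "application_summary": {"response_time": 1.2, "instance_count": 3},
      "links": {"application_data": [93818199, 93819351, 93819359], "servers": []}
    }
  ]
}
"""

d = json.loads(json_str)

# Pass the list of records, not the top-level dict. Nested dicts are
# flattened into dotted column names such as
# "application_summary.instance_count"; lists stay as single cell values.
flat = pd.json_normalize(d["applications"])

cols = ["id", "name", "last_reported_at", "application_summary.instance_count"]

# Optional: shorten the dotted column name to the one asked for.
result = flat[cols].rename(
    columns={"application_summary.instance_count": "instance_count"}
)
print(result)
```

The key point is that `json_normalize` is given `d["applications"]`, the list of application records, so each record becomes one row regardless of how many ids the file contains.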