Extracting nested items from a list of JSON responses

Time: 2019-11-01 20:36:09

Tags: json python-3.x dictionary url nested

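I have nested JSON with the structure shown below. For each JSON response, I want to iteratively extract data from several levels: "StreamId" at the first level, then the "TaskID" keys under "Summary_breakdown" (i.e. TaskID_1, TaskID_2), and within each "TaskID" every item except "tools_items", because it is too long and may cause problems inside the dataframe.

I want to write the result as a dictionary and eventually analyze it in a dataframe.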

{
    "success": true,
    "resource": {
        "StreamId": "xyz",
        "Summary_Measures": {
            "Summary_Report": {
                "Total_Cost": 7000,
                "Total_hours": 6087,
                "Summary_breakdown": {
                    "TaskID_1": {
                        "Task_details": "abc",
                        "Task_cost": 300,
                        "Task_hours": 87,
                        "tools_items": "an_extremely_long_string"
                    },
                    "TaskID_2": {
                        "Task_details": "defgyh",
                        "Task_cost": 400,
                        "Task_hours": 6000,
                        "tools_items": "another_extremely_long_string"
                    }
                }
            }
        }
    }
}
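
For a single parsed response, the extraction described above might be sketched as follows. This is only a minimal illustration: `sample` is a trimmed copy of the structure shown here, and the helper name `tasks_without` and its `skip` parameter are hypothetical. The key point is that "Summary_breakdown" is a JSON object, so after parsing it is a Python dict keyed by TaskID and can be walked with `.items()`.

```python
import json

# Trimmed copy of the sample response shown above (illustrative values only).
sample = json.loads("""
{
  "success": true,
  "resource": {
    "StreamId": "xyz",
    "Summary_Measures": {
      "Summary_Report": {
        "Total_Cost": 7000,
        "Total_hours": 6087,
        "Summary_breakdown": {
          "TaskID_1": {"Task_details": "abc", "Task_cost": 300,
                       "Task_hours": 87, "tools_items": "an_extremely_long_string"},
          "TaskID_2": {"Task_details": "defgyh", "Task_cost": 400,
                       "Task_hours": 6000, "tools_items": "another_extremely_long_string"}
        }
      }
    }
  }
}
""")


def tasks_without(response, skip=("tools_items",)):
    """Return {TaskID: fields} for one response, leaving out the keys in `skip`."""
    report = response["resource"]["Summary_Measures"]["Summary_Report"]
    breakdown = report["Summary_breakdown"]  # dict keyed by TaskID
    return {
        task_id: {k: v for k, v in fields.items() if k not in skip}
        for task_id, fields in breakdown.items()
    }


tasks = tasks_without(sample)
print(sample["resource"]["StreamId"], tasks)
```

The result keeps the TaskID layer as dictionary keys, which may be convenient if the nested form is what gets stored.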

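I managed to generate a list of URLs and store the JSON responses in a list, but in the second part of the script (responseitem) I am unable to extract the "TaskID" layer and the parameters inside each "TaskID". I tried to bypass the "TaskID" layer, but the code still does not run. Any solutions and suggestions would be much appreciated!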

import json
import pandas as pd
from urllib.request import urlopen

stream_id = ['sdfhef', 'VVqdhi']

# Build one endpoint URL per stream ID.
myurl_link = []
for sid in stream_id:
    endpoint = "https://~/%s/~" % sid
    myurl_link.append(endpoint)

# Fetch each URL and keep the parsed JSON response.
myjslist = []
for link in myurl_link:
    g = urlopen(link).read().decode('UTF-8')
    g_resp = json.loads(g)
    myjslist.append(g_resp)


responseitem = []
for item in myjslist:
    stream = item['resource']['StreamId']
    taskdetails = item['resource']['Summary_Measures']['Summary_Report']['Summary_breakdown'][0]['Task_details']
    taskcost = item['resource']['Summary_Measures']['Summary_Report']['Summary_breakdown'][0]['Task_cost']
    taskhours = item['resource']['Summary_Measures']['Summary_Report']['Summary_breakdown'][0]['Task_hours']
    responseitem.append({'taskdetails':taskdetails, 'taskcost':taskcost, 'taskhours': taskhours})

with open('responseitem.json', 'a') as f:
    json.dump(responseitem, f)
    f.write("\n")
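
A possible reason the second loop fails: "Summary_breakdown" is a dict keyed by TaskID, so `['Summary_breakdown'][0]` looks up the key `0`, which does not exist, and raises `KeyError`. A hedged sketch of one way around it is to iterate over the dict's items and emit one flat record per StreamId/TaskID pair. The names `flatten_responses` and `make_response` are hypothetical, and the `myjslist` built here is only a stand-in shaped like the sample response, not real API output.

```python
def flatten_responses(responses, skip=("tools_items",)):
    """Return one flat dict per (StreamId, TaskID) pair, omitting keys in `skip`."""
    rows = []
    for item in responses:
        resource = item["resource"]
        breakdown = resource["Summary_Measures"]["Summary_Report"]["Summary_breakdown"]
        # Summary_breakdown is a dict keyed by TaskID, so walk its items
        # instead of indexing it with [0].
        for task_id, fields in breakdown.items():
            row = {"StreamId": resource["StreamId"], "TaskID": task_id}
            row.update({k: v for k, v in fields.items() if k not in skip})
            rows.append(row)
    return rows


def make_response(stream, tasks):
    """Hypothetical helper: wrap `tasks` in the same nesting as the sample response."""
    return {
        "success": True,
        "resource": {
            "StreamId": stream,
            "Summary_Measures": {"Summary_Report": {"Summary_breakdown": tasks}},
        },
    }


# Stand-in for the list of parsed responses (illustrative values only).
myjslist = [
    make_response("xyz", {
        "TaskID_1": {"Task_details": "abc", "Task_cost": 300,
                     "Task_hours": 87, "tools_items": "an_extremely_long_string"},
        "TaskID_2": {"Task_details": "defgyh", "Task_cost": 400,
                     "Task_hours": 6000, "tools_items": "another_extremely_long_string"},
    }),
    make_response("uvw", {
        "TaskID_1": {"Task_details": "ijk", "Task_cost": 50,
                     "Task_hours": 12, "tools_items": "yet_another_long_string"},
    }),
]

responseitem = flatten_responses(myjslist)
for row in responseitem:
    print(row)
```

Because every record is a flat dict with the same keys, `pd.DataFrame(responseitem)` should then give one row per task, assuming pandas is installed.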



0 answers:

No answers