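Scraping nested tags in a JSON array

Time: 2016-06-23 19:29:06

Tags: python arrays json web-scraping urllib

I am trying to scrape data from a JSON file. I can extract data from some of the tags, but a few nested tags are giving me trouble. Here is a sample from the file: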

{"orders":[{
  "order_id":9000,
  "flight_start":"2017-06-15T05:00:00.000Z",
  "flight_end":"2017-06-22T05:00:00.000Z",
  "spots":[{
      "spot_id":7354259,
      "spot_length":15}],
  "constraints":{
      "forbid":[{
        "network":"BRVO"},
        {"network":"DSE"},
        {"network":"ESPN"},
        {"network":"DFC"},
        {"hours":[2,6],
         "days_of_week":["Monday","Tuesday","Thursday","Friday"]},
        {"hours":[2,6],
         "days_of_week":["Saturday","Sunday"]}],
      "allocation":[{
         "hours":[6,9],
         "impressions":{
             "min":0.05,
             "max":0.05},
         "days_of_week":["Monday","Tuesday","Wednesday","Thursday","Friday"]},{
         "hours":[20,0],
         "impressions":{"min":0.5,"max":0.5},
         "days_of_week":["Monday","Tuesday","Wednesday","Thursday","Friday"]},{
         "budget":{
             "min":1,
             "max":1},
         "spot_length":15}]}}]}

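I am unable to extract all the values of the network tag; my code returns only the first network value for each order.

I am using the following code: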

 import urllib.request
 import json
 url = 'http://vw-test.elasticbeanstalk.com/test'
 json_obj = urllib.request.urlopen(url).read().decode('UTF-8')
 data = json.loads(json_obj)
 for i in data["orders"]:
     k = i["order_id"]
     j = i["flight_start"]
     l = i["flight_end"]
     m = i['spots']
     for value in m:
         a = value["spot_length"]
         b = value["spot_id"]
     n = i["constraints"]
     c = n["forbid"]
     d = c[0]
     e = d["network"]
     print(e)

If anyone could help me solve this, I would really appreciate it.

1 Answer:

Answer 0: (score: 1)

The JSON data in your question is not complete. Making some assumptions, this might work:

for i in data["orders"]:
    k = i["order_id"]
    j = i["flight_start"]
    l = i["flight_end"]
    m = i['spots']
    for value in m:
        a = value["spot_length"]
        b = value["spot_id"]
    n = i["constraints"]
    c = n["forbid"]
    # Collect the network of every forbid entry that has one, not just c[0].
    networks = [d["network"] for d in c if "network" in d]
    print(networks)
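
For a version that can be run without the URL, here is a minimal sketch of the same idea. It assumes the data has the shape shown in the question. The sample is trimmed to one order, and the names `forbidden_networks` and `networks_by_order` are hypothetical helpers introduced here only for illustration.

```python
import json

# Trimmed copy of the sample from the question (one order, same key names).
SAMPLE = """
{"orders": [{
  "order_id": 9000,
  "constraints": {
    "forbid": [
      {"network": "BRVO"},
      {"network": "DSE"},
      {"network": "ESPN"},
      {"network": "DFC"},
      {"hours": [2, 6], "days_of_week": ["Monday", "Tuesday", "Thursday", "Friday"]},
      {"hours": [2, 6], "days_of_week": ["Saturday", "Sunday"]}
    ]
  }
}]}
"""


def forbidden_networks(order):
    """Return every "network" value under constraints.forbid for one order."""
    # .get() with defaults tolerates orders lacking "constraints" or "forbid".
    forbid = order.get("constraints", {}).get("forbid", [])
    # Entries without a "network" key (the hours/days_of_week ones) are skipped.
    return [entry["network"] for entry in forbid if "network" in entry]


data = json.loads(SAMPLE)
# Map each order_id to the list of networks forbidden for that order.
networks_by_order = {
    order["order_id"]: forbidden_networks(order) for order in data["orders"]
}
print(networks_by_order)  # {9000: ['BRVO', 'DSE', 'ESPN', 'DFC']}
```

Indexing with `c[0]` only ever looks at the first entry of the list, which is why the original code printed a single network. Iterating over the whole list and filtering on the key returns all of them.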