Filter JSON by a nested value

Asked: 2021-03-11 23:21:26

Tags: python json

I have been struggling to filter a JSON file and have tried several approaches without success.

My JSON looks like this:

{
  "some site": {
    "https://url.com/123...": {
      "Product Name": "A",
      "Product Price": "1213",
      "Product Category": "A",
      "Product Availability": "Not Available"
    },
    "https://url.com/456...": {
      "Product Name": "B",
      "Product Price": "59.95",
      "Product Category": "A",
      "Product Availability": "In Stock"
    }
  },
  "some other site": {
    "https://other_url.com/904543...": {
      "Product Name": "C",
      "Product Price": "479.95",
      "Product Category": "A",
      "Product Availability": "Not Available"
    },
    "https://other_url.com/432489...": {
      "Product Name": "D",
      "Product Price": "5",
      "Product Category": "B",
      "Product Availability": "In Stock"
    }
  }
}

I want to filter the whole structure on the key "Product Availability" == "In Stock". The expected result is:

{
  "some site": {
    "https://url.com/456...": {
      "Product Name": "B",
      "Product Price": "59.95",
      "Product Category": "A",
      "Product Availability": "In Stock"
    }
  },
  "some other site": {
    "https://other_url.com/432489...": {
      "Product Name": "D",
      "Product Price": "5",
      "Product Category": "B",
      "Product Availability": "In Stock"
    }
  }
}

I am reading the file with json.load():

def read_json(filename):
    with open(filename, encoding='utf-8') as json_file:
        return json.load(json_file)

Minimal reproducible example:

import json

data = """
{
  "some site": {
    "https://url.com/123...": {
      "Product Name": "A",
      "Product Price": "1213",
      "Product Category": "A",
      "Product Availability": "Not Available"
    },
    "https://url.com/456...": {
      "Product Name": "B",
      "Product Price": "59.95",
      "Product Category": "A",
      "Product Availability": "In Stock"
    }
  },
  "some other site": {
    "https://other_url.com/904543...": {
      "Product Name": "C",
      "Product Price": "479.95",
      "Product Category": "A",
      "Product Availability": "Not Available"
    },
    "https://other_url.com/432489...": {
      "Product Name": "D",
      "Product Price": "5",
      "Product Category": "B",
      "Product Availability": "In Stock"
    }
  }
}"""

products = json.loads(data)

output_dict = [x for x,(z) in products.items() if z["Product Availability"] == "In Stock"]
print(output_dict)

This raises KeyError: 'Product Availability'.

Any help would be greatly appreciated!

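2 Answers:

Answer 0 (score: 3)

You need a dict comprehension, not a list comprehension, to build a dictionary.

Product Availability is a key of the dictionaries nested inside z, not of z itself. You need a nested dict comprehension to filter the products within each site.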

output_dict = {site: {
    url: attributes for url, attributes in p.items() if attributes['Product Availability'] == "In Stock"
    } for site, p in products.items()}
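To see why the original list comprehension raised KeyError, the following minimal sketch inspects what each value of products.items() actually holds. The `products` literal is a trimmed-down stand-in for the question's data (one site, only the availability field), not the full file.

```python
# Trimmed-down stand-in for the question's data (one site, two products).
products = {
    "some site": {
        "https://url.com/123...": {"Product Availability": "Not Available"},
        "https://url.com/456...": {"Product Availability": "In Stock"},
    }
}

for site, z in products.items():
    # z maps URL -> attribute dict, so its keys are the URLs.
    print(site, list(z.keys()))
    # The attribute name is not a key at this level, hence the KeyError.
    print("Product Availability" in z)

# One level deeper, the key is present in every attribute dict.
inner_keys = [list(attrs.keys()) for z in products.values() for attrs in z.values()]
print(inner_keys)
```

The attribute lookup therefore has to happen inside a second loop (or inner comprehension) over each site's URLs.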

It may be easier to understand with regular nested loops.

output_dict = {}
for site, product_dict in products.items():
    output_site = {}
    for url, attributes in product_dict.items():
        if attributes['Product Availability'] == 'In Stock':
            output_site[url] = attributes
    output_dict[site] = output_site
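If the same filtering is needed for other fields, the loop above could be wrapped in a small function. This is only a sketch: `filter_products` and its `drop_empty_sites` parameter are hypothetical names that do not appear in the original answer, and the sample data is a reduced variant of the question's structure in which one site has no product in stock.

```python
def filter_products(products, key, value, drop_empty_sites=False):
    """Keep only the products whose attribute `key` equals `value`.

    Hypothetical helper: the name and the `drop_empty_sites` option
    are assumptions made for illustration.
    """
    result = {}
    for site, product_dict in products.items():
        kept = {
            url: attributes
            for url, attributes in product_dict.items()
            if attributes.get(key) == value
        }
        # Keep the site unless it is empty and the caller asked to drop it.
        if kept or not drop_empty_sites:
            result[site] = kept
    return result


# Reduced variant of the question's data.
products = {
    "some site": {
        "https://url.com/123...": {"Product Name": "A", "Product Availability": "Not Available"},
        "https://url.com/456...": {"Product Name": "B", "Product Availability": "In Stock"},
    },
    "some other site": {
        "https://other_url.com/904543...": {"Product Name": "C", "Product Availability": "Not Available"},
    },
}

in_stock = filter_products(products, "Product Availability", "In Stock")
in_stock_nonempty = filter_products(
    products, "Product Availability", "In Stock", drop_empty_sites=True
)
```

By default a site with no matching product stays in the result as an empty dict, which matches the behaviour of the loop above; passing drop_empty_sites=True removes such sites instead.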

Answer 1 (score: 1)

You can do it with a nested dict comprehension, but it is not easy to read at first glance:

{key: {s: v for s, v in val.items() if v.get("Product Availability") == "In Stock"} for key, val in products.items()}

This gives:

{
    "some site": {
        "https://url.com/456...": {
            "Product Name": "B",
            "Product Price": "59.95",
            "Product Category": "A",
            "Product Availability": "In Stock"
        }
    },
    "some other site": {
        "https://other_url.com/432489...": {
            "Product Name": "D",
            "Product Price": "5",
            "Product Category": "B",
            "Product Availability": "In Stock"
        }
    }
}

...but in practice, nested loops are probably easier to manage:

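import json

# Placeholder from the original answer: substitute the real JSON text.
site_data = json.loads('your JSON...')
result = {}

for site_title, urls in site_data.items():
    result[site_title] = {}
    for url, url_data in urls.items():
        # Check the product's own attributes, not the top-level dict.
        if url_data.get("Product Availability") == "In Stock":
            result[site_title][url] = url_data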

The result is the same.
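One difference between the two answers only shows up when a product lacks the availability field: .get() returns None for a missing key, while square-bracket indexing raises KeyError. The sketch below illustrates this; the second entry and its URL are hypothetical and were added only to trigger the missing-key case.

```python
products = {
    "some site": {
        "https://url.com/456...": {"Product Name": "B", "Product Availability": "In Stock"},
        # Hypothetical entry with the availability field missing.
        "https://url.com/789...": {"Product Name": "E"},
    }
}

# .get() yields None for the missing key, so that entry is simply filtered out.
with_get = {
    site: {
        url: attrs
        for url, attrs in urls.items()
        if attrs.get("Product Availability") == "In Stock"
    }
    for site, urls in products.items()
}

# Square-bracket indexing raises KeyError on the same entry.
try:
    with_index = {
        site: {
            url: attrs
            for url, attrs in urls.items()
            if attrs["Product Availability"] == "In Stock"
        }
        for site, urls in products.items()
    }
except KeyError as exc:
    with_index = None
    missing_key = exc.args[0]
```

Which behaviour is preferable depends on the data: silently skipping incomplete entries is convenient for scraped input, whereas a KeyError surfaces malformed records early.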
