I am parsing complex JSON data with Python. The JSON looks like this:
{
    "data": [{
        "product_sn": "ABP-145",
        "process_data": [{
            "step_name": "step_a",
            "progress": {
                "total_steps": 10,
                "finished_steps": 10
            }
        },
        {
            "step_name": "step_b",
            "progress": {
                "total_steps": 9,
                "finished_steps": 6
            }
        },
        {
            "step_name": "step_c",
            "progress": {
                "total_steps": 15,
                "finished_steps": 15
            }
        }]
    },
    {
        "product_sn": "ABP-146",
        "process_data": [{
            "step_name": "step_a",
            "progress": {
                "total_steps": 10,
                "finished_steps": 8
            }
        },
        {
            "step_name": "step_b",
            "progress": {
                "total_steps": 9,
                "finished_steps": 6
            }
        }]
    }]
}
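The business scenario: to manufacture a product we go through several steps: step_a, step_b, and step_c. To start step_c, the prerequisites (as encoded in the query and code below) are:
1. step_a is finished, i.e. its total_steps equals its finished_steps;
2. step_c does not exist yet in process_data.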
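Now I want to get every product_sn that is ready to start step_c.
Currently I use several nested for loops to walk the nested dicts and lists that json.loads() creates. The code is long, complicated, and hard to maintain. I wonder whether there is something like JSONPath that could express it in the following way: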
get(
    value=data.product_sn,
    criteria=(
        data.process_data(step_name=="step_a").
            progress("total_steps".value == "finished_steps".value) and
        $not_exist data.process_data.step_name=="step_c"
    )
)
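so that I get every product_sn matching the search criteria.
I searched for examples and tried jsonpath_ng and jsonpath_rw, but the examples I found are very simple. Can anyone tell me how to implement the query above in some simple way? I really do not want to use long, complicated, and ugly nested for loops anymore.
Below is the code I currently use to process this JSON (heavily simplified to explain the problem; the real business logic is much more complicated):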
import json
json_str = '''{
    "data": [{
        "product_sn": "ABP-145",
        "process_data": [{
            "step_name": "step_a",
            "progress": {
                "total_steps": 10,
                "finished_steps": 10
            }
        },
        {
            "step_name": "step_b",
            "progress": {
                "total_steps": 9,
                "finished_steps": 6
            }
        },
        {
            "step_name": "step_c",
            "progress": {
                "total_steps": 15,
                "finished_steps": 15
            }
        }]
    },
    {
        "product_sn": "ABP-146",
        "process_data": [{
            "step_name": "step_a",
            "progress": {
                "total_steps": 10,
                "finished_steps": 8
            }
        },
        {
            "step_name": "step_b",
            "progress": {
                "total_steps": 9,
                "finished_steps": 6
            }
        }]
    },
    {
        "product_sn": "ABP-147",
        "process_data": [{
            "step_name": "step_a",
            "progress": {
                "total_steps": 10,
                "finished_steps": 10
            }
        },
        {
            "step_name": "step_b",
            "progress": {
                "total_steps": 9,
                "finished_steps": 6
            }
        }]
    }]
}'''
json_obj = json.loads(json_str)
valid_products = list()
for product in json_obj.get('data'):
    product_sn = product['product_sn']
    process_data = product.get("process_data")
    # Skip products that have no process data at all.
    if not process_data:
        continue
    valid_product = False
    for step in process_data:
        step_name = step['step_name']
        if step_name == 'step_c':
            # step_c already exists, so this product is not a candidate.
            valid_product = False
            break
        elif step_name == 'step_a':
            progress = step['progress']
            # step_a must be completely finished.
            if progress['total_steps'] == progress['finished_steps']:
                valid_product = True
            else:
                valid_product = False
                break
    if valid_product:
        valid_products.append(product_sn)
    else:
        continue
print(valid_products)
Answer 0 (score: 1)
Assuming your JSON object is stored in the variable o:
prods = [p['product_sn'] for p in o['data'] if [a for a in p['process_data'] if a['step_name']=="step_a" and a['progress']['total_steps']==a['progress']['finished_steps']] and not [c for c in p['process_data'] if c['step_name']=="step_c"]]
Sorry for the long one-liner; I do not have PyCharm at hand to split it across several lines so that it looks nice.
You can see the working code here: link to repl.it
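For readability, the same filter can be spread over several lines and written with any() instead of truth-testing temporary lists. This is only a sketch of that idea, not the answerer's own code: the variable o follows the answer above, and the sample data is a shortened stand-in for the question's JSON.

```python
# Shortened stand-in for the parsed JSON from the question (assumption).
o = {
    "data": [
        {"product_sn": "ABP-145", "process_data": [
            {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 10}},
            {"step_name": "step_c", "progress": {"total_steps": 15, "finished_steps": 15}},
        ]},
        {"product_sn": "ABP-146", "process_data": [
            {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 8}},
        ]},
        {"product_sn": "ABP-147", "process_data": [
            {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 10}},
            {"step_name": "step_b", "progress": {"total_steps": 9, "finished_steps": 6}},
        ]},
    ]
}

prods = [
    p["product_sn"]
    for p in o["data"]
    # step_a exists and all of its sub-steps are finished ...
    if any(
        s["step_name"] == "step_a"
        and s["progress"]["total_steps"] == s["progress"]["finished_steps"]
        for s in p["process_data"]
    )
    # ... and no step_c entry has been created yet.
    and not any(s["step_name"] == "step_c" for s in p["process_data"])
]

print(prods)
```

any() stops at the first matching step, so it avoids building the intermediate lists that the one-liner creates only to test whether they are empty.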
Answer 1 (score: 0)
You can use a more functional approach to make it cleaner.
import json
from operator import itemgetter

json_obj = json.loads(json_str)
products = json_obj.get("data")

def steps_by_name(p):
    # process_data is a list, so index its steps by step_name first.
    return {s["step_name"]: s["progress"] for s in p.get("process_data", [])}

valid_products = filter(
    lambda p: "step_a" in steps_by_name(p) and
    "step_c" not in steps_by_name(p) and
    steps_by_name(p)["step_a"]["total_steps"] == steps_by_name(p)["step_a"]["finished_steps"],
    products
)
valid_product_sns = list(map(itemgetter("product_sn"), valid_products))
Of course, the filter lambda is still pretty ugly.
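One possible way to tame the filter condition, sketched here under assumptions rather than taken from either answer, is to index each product's steps by name and put the rule into a small named predicate. The helpers steps_by_name, is_finished, and ready_for are hypothetical names introduced for illustration, and the sample list is a shortened stand-in for the question's data.

```python
def steps_by_name(product):
    """Map step_name -> progress dict for one product (hypothetical helper)."""
    return {s["step_name"]: s["progress"] for s in product.get("process_data") or []}


def is_finished(progress):
    """A step counts as finished when every sub-step is done."""
    return progress["total_steps"] == progress["finished_steps"]


def ready_for(product, finished=(), absent=()):
    """True if every step in `finished` is complete and no step in `absent` exists."""
    steps = steps_by_name(product)
    return (
        all(name in steps and is_finished(steps[name]) for name in finished)
        and not any(name in steps for name in absent)
    )


# Shortened stand-in for json_obj["data"] from the question (assumption).
products = [
    {"product_sn": "ABP-145", "process_data": [
        {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 10}},
        {"step_name": "step_c", "progress": {"total_steps": 15, "finished_steps": 15}},
    ]},
    {"product_sn": "ABP-146", "process_data": [
        {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 8}},
    ]},
    {"product_sn": "ABP-147", "process_data": [
        {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 10}},
        {"step_name": "step_b", "progress": {"total_steps": 9, "finished_steps": 6}},
    ]},
    {"product_sn": "ABP-148"},  # no process_data at all
]

ready = [
    p["product_sn"]
    for p in products
    if ready_for(p, finished=["step_a"], absent=["step_c"])
]
print(ready)
```

Keeping the rule in parameters means a different prerequisite, such as requiring both step_a and step_b to be finished, becomes a change to the call rather than to the loop.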