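I am trying to merge objects based on the key specs. Most of the key structure is consistent. The merge should only happen when company_name is the same (in this example I have only one company_name), and only when (name, {color, type, license, description}) are equal across the lists.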
[
  {
    "company_name": "GreekNLC",
    "metadata": [
      {
        "name": "Bob",
        "details": [
          {
            "color": "black",
            "type": "bmw",
            "license": "4DFLK",
            "specs": [
              {
                "properties": [
                  {
                    "info": [
                      "sedan",
                      "germany"
                    ]
                  },
                  {
                    "info": [
                      "drive",
                      "expensive"
                    ]
                  }
                ]
              }
            ],
            "description": "amazing car"
          }
        ]
      },
      {
        "name": "Bob",
        "car_details": [
          {
            "color": "black",
            "type": "bmw",
            "license": "4DFLK",
            "specs": [
              {
                "properties": [
                  {
                    "info": [
                      "powerful",
                      "convertable"
                    ]
                  },
                  {
                    "info": [
                      "drive",
                      "expensive"
                    ]
                  }
                ]
              }
            ],
            "description": "amazing car"
          }
        ]
      }
    ]
  }
]
I expect the following output:
[
  {
    "company_name": "GreekNLC",
    "metadata": [
      {
        "name": "Bob",
        "details": [
          {
            "color": "black",
            "type": "bmw",
            "license": "4DFLK",
            "specs": [
              {
                "properties": [
                  {
                    "info": [
                      "powerful",
                      "convertable"
                    ]
                  },
                  {
                    "info": [
                      "sedan",
                      "germany"
                    ]
                  },
                  {
                    "info": [
                      "drive",
                      "expensive"
                    ]
                  }
                ]
              }
            ],
            "description": "amazing car"
          }
        ]
      }
    ]
  }
]
My code so far:
import json
from itertools import groupby

headers = ['color', 'license', 'type', 'description']

def _key(d):
    return [d.get(i) for i in headers]

def get_specs(b):
    _specs = [c['properties'] for i in b for c in i['specs']]
    return [{"properties": [i for b in _specs for i in b]}]

def merge(d):
    new_merged_list = [[a, list(b)] for a, b in groupby(sorted(d, key=_key), key=_key)]
    k = [{**dict(zip(headers, a)), 'specs': get_specs(b)} for a, b in new_merged_list]
    return k

# `data` is the input list shown above
result = {'name': merge(c.get("details")) for i in data for c in i.get("metadata")}
print(json.dumps(result))
But it doesn't work. I get:
{"name": [{"color": "black", "specs": [{"properties": [{"info":
["amazing", "strong"]}]}]}]}
Answer 0 (score: 1)
What you are trying to do is similar to a group-by on company_name, name, color, type, license, and description.
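You can turn every car property into a (key, value) tuple pair, where the key is the composite of those six fields, put the resulting tuples in a set to drop duplicates, then group by the composite key and rebuild the list.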
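Before the full function, the core idea can be seen in isolation on simplified flat records. This is only a minimal sketch: the `records` list and the `group_unique` helper are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict

# Hypothetical flat records: (composite key, property) pairs.
# Lists were already converted to tuples so that every pair is hashable.
records = [
    (("GreekNLC", "Bob"), ("info", ("sedan", "germany"))),
    (("GreekNLC", "Bob"), ("info", ("drive", "expensive"))),
    (("GreekNLC", "Bob"), ("info", ("powerful", "convertable"))),
    (("GreekNLC", "Bob"), ("info", ("drive", "expensive"))),  # duplicate
]


def group_unique(pairs):
    """Drop duplicate (key, value) pairs, then collect the values per key."""
    grouped = defaultdict(list)
    for key, value in set(pairs):  # set() requires hashable items
        grouped[key].append(value)
    # Set iteration order is arbitrary, so sort for a deterministic result.
    return {key: sorted(values) for key, values in grouped.items()}


merged = group_unique(records)
```

The duplicate ("drive", "expensive") property collapses into a single entry, so the one key ends up with three properties.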
from collections import defaultdict
from collections.abc import Hashable


def merge_spec_props(company_data):
    # One (composite key, property) pair per property of every car.
    keyed_tuples = (
        (
            (
                co['company_name'],
                user['name'],
                car_detail['color'],
                car_detail['type'],
                car_detail['license'],
                car_detail['description'],
            ),
            # Lists are unhashable, so turn them into tuples for the set.
            tuple(
                (k, v if isinstance(v, Hashable) else tuple(v))
                for k, v in prop.items()
            ),
        )
        for co in company_data
        for user in co['metadata']
        # NOTE: the sample input uses both 'details' and 'car_details';
        # this lookup assumes every entry uses 'car_details'.
        for car_detail in user['car_details']
        for spec in car_detail['specs']
        for prop in spec['properties']
    )

    # The set drops duplicate (key, property) pairs.
    uniq = set(keyed_tuples)

    # Collect the unique properties under their composite key.
    grouped = defaultdict(list)
    for k, spec in uniq:
        grouped[k].append(spec)

    # Rebuild the nested structure, one entry per composite key.
    merged_lst = [
        {
            'company_name': company_name,
            'metadata': [{
                'name': username,
                'car_details': [{
                    'color': car_color,
                    'type': car_type,
                    'license': car_license,
                    'specs': [dict(spec) for spec in specs],
                    'description': desc,
                }],
            }],
        }
        for (company_name, username, car_color, car_type, car_license, desc), specs
        in grouped.items()
    ]
    return merged_lst
Note that this implementation is very specific to your data, so the function is unlikely to be reusable for other kinds of data.
If description or any other field in car_details differs, the car ends up in a separate company entry.
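It is also worth noting that it does not merge on intermediate fields. One workable approach would be to convert the data into a tree and do a post-order traversal to obtain the merged structure.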
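One possible reading of that tree idea is a recursive merge: siblings are grouped by their non-nested fields, and the nested lists of each group are merged recursively before the merged node is emitted. The sketch below is an assumption about how this could look, not the answerer's code. The names `merge_records`, `_is_child_list`, `_freeze` and `car` are hypothetical, non-list values are assumed to be hashable, and the sample uses the key details consistently.

```python
def _is_child_list(value):
    """A non-empty list of dicts is treated as a list of child nodes."""
    return (isinstance(value, list) and bool(value)
            and all(isinstance(x, dict) for x in value))


def _freeze(value):
    """Turn lists into tuples so leaf values can be part of a dict key."""
    if isinstance(value, list):
        return tuple(_freeze(x) for x in value)
    return value  # assumption: anything else is already hashable


def merge_records(records):
    """Merge sibling dicts whose leaf fields are all equal, at every level."""
    groups = {}  # identity -> (leaf fields, {child field: [child dicts]})
    for rec in records:
        leaves = {k: v for k, v in rec.items() if not _is_child_list(v)}
        identity = tuple(sorted((k, _freeze(v)) for k, v in leaves.items()))
        _, children = groups.setdefault(identity, (leaves, {}))
        for k, v in rec.items():
            if _is_child_list(v):
                children.setdefault(k, []).extend(v)

    merged = []
    for leaves, children in groups.values():
        node = dict(leaves)
        for k, kids in children.items():
            node[k] = merge_records(kids)  # merge the subtree first
        merged.append(node)
    return merged


def car(props):
    """Hypothetical helper that builds one car entry for the sample."""
    return {"color": "black", "type": "bmw", "license": "4DFLK",
            "specs": [{"properties": [{"info": p} for p in props]}],
            "description": "amazing car"}


data = [{"company_name": "GreekNLC", "metadata": [
    {"name": "Bob", "details": [car([["sedan", "germany"],
                                     ["drive", "expensive"]])]},
    {"name": "Bob", "details": [car([["powerful", "convertable"],
                                     ["drive", "expensive"]])]},
]}]

result = merge_records(data)
```

With this sample, the two Bob entries collapse into one, and the single car keeps three distinct properties. Properties come out in first-seen order rather than in the order shown in the question, and entries that use different key names (details versus car_details) would stay separate fields on the merged node.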