Merging two dictionaries in separate lists by ID

Time: 2019-06-09 19:55:07

Tags: python arrays merge

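I am trying to merge objects based on the key specs. Most of the key structure is consistent. The merge should happen only when company_name is the same (in this example I have only one company_name), and only when (name, {color, type, license, description}) are equal across the lists.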

[
{
    "company_name": "GreekNLC",
    "metadata": [
        {
            "name": "Bob",
            "details": [
                {
                    "color": "black",
                    "type": "bmw",
                    "license": "4DFLK",
                    "specs": [
                        {
                            "properties": [
                                {
                                    "info": [
                                        "sedan",
                                        "germany"
                                    ]
                                },
                                {
                                    "info": [
                                        "drive",
                                        "expensive"
                                    ]
                                }
                            ]
                        }
                    ],
                    "description": "amazing car"
                }
            ]
        },
        {
            "name": "Bob",
            "car_details": [
                {
                    "color": "black",
                    "type": "bmw",
                    "license": "4DFLK",
                    "specs": [
                        {
                            "properties": [
                                {
                                    "info": [
                                        "powerful",
                                        "convertable"
                                    ]
                                },
                                {
                                    "info": [
                                        "drive",
                                        "expensive"
                                    ]
                                }
                            ]
                        }
                    ],
                    "description": "amazing car"
                }
            ]
        }
    ]
}
]

I expect the following output:

[
{
    "company_name": "GreekNLC",
    "metadata": [
        {
            "name": "Bob",
            "details": [
                {
                    "color": "black",
                    "type": "bmw",
                    "license": "4DFLK",
                    "specs": [
                        {
                            "properties": [
                                {
                                    "info": [
                                        "powerful",
                                        "convertable"
                                    ]
                                },
                                {
                                    "info": [
                                        "sedan",
                                        "germany"
                                    ]
                                },
                                {
                                    "info": [
                                        "drive",
                                        "expensive"
                                    ]
                                }
                            ]
                        }
                    ],
                    "description": "amazing car"
                }
            ]
        }
    ]
}
]

My code so far:

import json
from itertools import groupby

# `data` is the input list shown above.
headers = ['color', 'license', 'type', 'description']

def _key(d):
  return [d.get(i) for i in headers]

def get_specs(b):
  _specs = [c['properties'] for i in b for c in i['specs']]
  return [{"properties": [i for b in _specs for i in b]}]

def merge(d):
  new_merged_list = [[a, list(b)] for a, b in groupby(sorted(d, key=_key), key=_key)]
  k = [{**dict(zip(headers, a)), 'specs': get_specs(b)} for a, b in new_merged_list]
  return k

result = {'name': merge(c.get("details")) for i in data for c in i.get("metadata")}

print(json.dumps(result))

But it does not work. I get:

{"name": [{"color": "black", "specs": [{"properties": [{"info": ["amazing", "strong"]}]}]}]}

1 Answer:

Answer 0: (score: 1)

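What you want to do is similar to a group-by on: company_name, name, color, type, license, description.

You can turn every car into (key, value) tuples, apply a set operation to the resulting tuples to drop duplicates, then group by the composite key and rebuild the list.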

from collections import defaultdict
from collections.abc import Hashable

def merge_spec_props(company_data):
    # One (composite key, frozen property) pair per property of every car.
    keyed_tuples = (
        (
            (
                co['company_name'],
                user['name'],
                car_detail['color'],
                car_detail['type'],
                car_detail['license'],
                car_detail['description'],
            ),
            # Freeze each property into a tuple of (key, value) pairs so that it
            # is hashable and exact duplicates collapse in the set below.
            tuple(
                (k, v if isinstance(v, Hashable) else tuple(v))
                for k, v in prop.items()
            ),
        )
        for co in company_data
        for user in co['metadata']
        # The sample input uses both 'details' and 'car_details'; accept either.
        for car_detail in user.get('car_details', user.get('details', []))
        for spec in car_detail['specs']
        for prop in spec['properties']
    )
    uniq = set(keyed_tuples)

    # Group the de-duplicated properties by their composite key.
    grouped = defaultdict(list)
    for k, spec in uniq:
        grouped[k].append(spec)

    # Rebuild the nested structure, one entry per composite key.
    merged_lst = [
        {
            'company_name': company_name,
            'metadata': [{
                'name': username,
                'car_details': [{
                    'color': car_color,
                    'type': car_type,
                    'license': car_license,
                    'specs': [dict(spec) for spec in specs],
                    'description': desc,
                }]
            }]
        }
        for (company_name, username, car_color, car_type, car_license, desc), specs in grouped.items()
    ]

    return merged_lst
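
To see the dedupe-then-group step in isolation, here is a toy sketch. The `rows` data and the `dedupe_and_group` name are made up for illustration and are not part of the answer; the point is only that payloads must be hashable (tuples, not lists) before they can go into a set.

```python
from collections import defaultdict

# Hypothetical toy rows: (composite key, payload). Lists are not hashable,
# so each payload is stored as a tuple before it goes into a set.
rows = [
    (("GreekNLC", "Bob"), ("sedan", "germany")),
    (("GreekNLC", "Bob"), ("drive", "expensive")),
    (("GreekNLC", "Bob"), ("drive", "expensive")),  # exact duplicate
    (("GreekNLC", "Alice"), ("powerful",)),
]

def dedupe_and_group(pairs):
    """Drop exact duplicate pairs, then collect payloads per composite key."""
    grouped = defaultdict(list)
    for key, payload in set(pairs):
        grouped[key].append(list(payload))
    # Set iteration order is arbitrary, so sort for reproducible output.
    return {key: sorted(values) for key, values in grouped.items()}

result = dedupe_and_group(rows)
print(result)
```

The duplicate "drive"/"expensive" row is kept only once, and rows whose composite key differs in any component land in a separate group.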

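This implementation is very specific to your data, though, so the function is unlikely to be reusable for other kinds of data. If any of the grouping fields differ (for example description, or another field inside car_details), the corresponding specs end up in a separate company entry.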

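Note that it does not merge on intermediate fields. One workable approach would be to convert the data into a tree and do a post-order traversal to obtain the merged structure.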
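
The tree idea can be sketched roughly as follows. This is only one possible reading of it, not the answerer's code. `merge_nodes`, `is_branch`, `freeze`, and `car` are hypothetical helper names. The sketch assumes that every list is either a list of dicts (a branch) or a list of scalars (a leaf), that all leaf values are hashable once lists are turned into tuples, and that both entries use the same `details` key.

```python
import json

def is_branch(value):
    """A branch is a non-empty list whose elements are all dicts."""
    return isinstance(value, list) and bool(value) and all(isinstance(x, dict) for x in value)

def freeze(value):
    """Turn (nested) lists into tuples so leaf values can be used in a dict key."""
    return tuple(freeze(x) for x in value) if isinstance(value, list) else value

def merge_nodes(nodes):
    """Merge sibling dicts whose leaf fields are equal, then merge their children."""
    merged = {}  # leaf key -> combined node; dicts keep first-seen order
    for node in nodes:
        key = tuple(sorted((k, freeze(v)) for k, v in node.items() if not is_branch(v)))
        target = merged.setdefault(key, {})
        for k, v in node.items():
            if is_branch(v):
                # Concatenate children of equal siblings into a fresh list.
                target.setdefault(k, []).extend(v)
            else:
                target[k] = v
    # Recurse into the concatenated children; a node is complete only
    # once all of its child lists have been merged.
    return [
        {k: merge_nodes(v) if is_branch(v) else v for k, v in node.items()}
        for node in merged.values()
    ]

def car(*infos):
    """Build one hypothetical car entry shaped like the question's data."""
    return {
        "color": "black", "type": "bmw", "license": "4DFLK",
        "specs": [{"properties": [{"info": list(i)} for i in infos]}],
        "description": "amazing car",
    }

sample = [{
    "company_name": "GreekNLC",
    "metadata": [
        {"name": "Bob", "details": [car(("sedan", "germany"), ("drive", "expensive"))]},
        {"name": "Bob", "details": [car(("powerful", "convertable"), ("drive", "expensive"))]},
    ],
}]

merged = merge_nodes(sample)
print(json.dumps(merged, indent=2))
```

Because siblings are compared on all of their leaf fields at every level, this version also merges on the intermediate fields (name, then the car fields) and keeps the nested specs/properties layout. The properties come out in first-seen order rather than in the order shown in the question's expected output.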