Question

I have two lists of nested dictionaries:

lofd1 = [{'A': {'facebook':{'handle':'https://www.facebook.com/pages/New-Jersey/108325505857259','logo_id': None}, 'contact':{'emails':['nj@nj.gov','state@nj.gov']},'state': 'nj', 'population':'12345', 'capital':'Jersey','description':'garden state'}}]
lofd2 = [{'B':{'building_type':'ranch', 'city':'elizabeth', 'state':'nj', 'description':'the state close to NY'}}]

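I need to:

Merge similar dictionaries in the lists using the value of the 'state' key (e.g. merge all dictionaries where 'state' = 'nj' into a single dictionary).
The result should include only once any key/value pair that appears in both dictionaries (e.g. 'state' should be 'nj' for both dictionaries).
It should include key/value pairs that exist in only one of the dictionaries (e.g. 'population' and 'capital' from lofd1, and 'building_type' and 'city' from lofd2).
Some values in the dictionaries should be excluded, e.g. 'logo_id': None.
Put the 'description' values from both dictionaries into a list of strings, e.g. 'description': ['garden state', 'the state close to NY'].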

The final dataset should look like this:

lofd_final = [{'state': 'nj', 'facebook':{'handle':'https://www.facebook.com/pages/New-Jersey/108325505857259'},'population':'12345', 'capital':'Jersey', 'contact':{'emails':['nj@nj.gov','state@nj.gov']}, 'description': ['garden state','the state close to NY'],'building_type':'ranch', 'city':'elizabeth'}]

What would be an efficient solution?

Answer 1

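Here is a solution tailored to your specific case. In terms of time complexity it is O(n*m), where n is the number of dictionaries in the lists and m is the number of keys per dictionary. Each key of each dictionary is visited only once.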

def extract_data(lofd, output):
    for d in lofd:
        for top_level_key in d: # This will be the A or B key from your example
            data = d[top_level_key] 
            state = data['state']
            if state not in output: # Create the state entry for the first time
                output[state] = {}
            # Now update the state entry with the data you care about
            for key in data:
                # Handle descriptions
                if key == 'description':
                    if 'description' not in output[state]:
                        output[state]['description'] = [data['description']]
                    else:
                        output[state]['description'].append(data['description'])
                # Handle all other keys
                else:
                    # Handle facebook key (exclude logo_id)
                    if key == 'facebook':
                        # pop() avoids a KeyError when 'logo_id' is absent
                        data['facebook'].pop('logo_id', None)
                    output[state][key] = data[key]

output = {}
extract_data(lofd1, output)
extract_data(lofd2, output)
print(list(output.values()))

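output will be a dict keyed by state at the top level. To convert it to the format you specified, just extract the values into a flat list: list(output.values()) (see the example above).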

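Note: I assume a deep copy is not needed, i.e. that you will not modify the values in lofd1 and lofd2 after extracting the data. This is also based purely on the spec given; for example, if more nested keys need to be excluded, you will need to add the extra filters yourself.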
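
If you would rather not touch the input lists, or need to drop None values at any nesting depth rather than just 'logo_id', a variant along the following lines might work. This is only a sketch under assumptions: drop_none and merge_by_state are names made up for this illustration, every inner dictionary is assumed to have a non-None 'state' key, and for non-'description' keys the last value seen wins.

```python
def drop_none(value):
    """Return a copy of value with None-valued dict entries removed at every level."""
    if isinstance(value, dict):
        return {k: drop_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_none(v) for v in value]
    return value


def merge_by_state(*lists_of_dicts):
    """Merge the inner dictionaries of all given lists, grouped by their 'state' value."""
    merged = {}
    for lofd in lists_of_dicts:
        for d in lofd:
            for data in d.values():  # the inner dict under 'A', 'B', ...
                entry = merged.setdefault(data['state'], {})
                # drop_none builds new dicts/lists, so the inputs are never mutated
                for key, value in drop_none(data).items():
                    if key == 'description':
                        entry.setdefault('description', []).append(value)
                    else:
                        entry[key] = value  # assumption: last value wins on conflict
    return list(merged.values())


lofd1 = [{'A': {'facebook': {'handle': 'https://www.facebook.com/pages/New-Jersey/108325505857259', 'logo_id': None},
                'contact': {'emails': ['nj@nj.gov', 'state@nj.gov']},
                'state': 'nj', 'population': '12345', 'capital': 'Jersey',
                'description': 'garden state'}}]
lofd2 = [{'B': {'building_type': 'ranch', 'city': 'elizabeth', 'state': 'nj',
                'description': 'the state close to NY'}}]

lofd_final = merge_by_state(lofd1, lofd2)
print(lofd_final)
```

The complexity stays linear in the total number of keys and nested values, at the cost of copying the nested containers.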
