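Merging two dictionaries while preserving the distinct values

Time: 2013-11-25 16:28:14

Tags: python dictionary merge

I am new to Python (Python 3.2) and have been struggling with a tricky problem. I have two dictionaries whose values are lists: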

d1 = {
'mammals': ['dog', '5', 'cat', '4', 'mouse', '4', 'bat', '3'], 
'bird': ['robin', '8', 'bluejay', '6', 'goose', '5', 'cardinal', '5']
}

d2 = {
'mammals': ['cow', '5', 'horse', '4', 'cat', '4', 'dog', '3', 'beaver', '3'], 
'bird': ['bluejay', '9', 'goose', '8', 'eagle', '8', 'robin', '7', 'duck', '6', 'cardinal', '5']
}

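In each dictionary, the number in a name-number pair (e.g. 'dog', '5') is the number of instances of that item in the original database.

What I need is to merge the two dictionaries in a way that preserves the count information (again, in the example, the new dictionary would have 'dog', '5', '3'). The merged dictionary would look something like this (it does not have to use nested dictionaries; I only wrote it that way to make it easier to visualize. The important thing is that the information is preserved):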

d_merged = { 
'mammals': [{'dog': ['5', '3']},  {'cat': ['4', '4']}, {'mouse': '4'}, {'bat': '3'} , {'cow': '5'},
 {'horse': '4'}, {'beaver': '3'}],
'bird': [{'robin': ['8', '7']},  {'bluejay': ['6', '9']}, {'goose': ['5','8']},  {'cardinal': ['5',
 '5']}, {'eagle': '8'},  {'duck': '6'}]
}

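I have tried all sorts of things with tuples, nested dictionaries and other possibilities, but the result was always a mess. If someone could point me in a good direction for solving this, it would mean a lot. Thank you very much.

2 answers:

Answer 0 (score: 2)

First, you can convert d1 and d2 into dictionaries that are easier to work with:

[Note that list[::2] is the sublist holding all items at even indices, and list[1::2] holds the items at odd indices.]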

>>> dc1 = {}
>>> for family in d1.keys():
        l = d1[family]
        # names sit at even indices, counts at odd indices; // keeps the range argument an int in Python 3
        dc1[family] = {l[::2][i]: [l[1::2][i]] for i in range(len(l) // 2)}


>>> dc2 = {}
>>> for family in d2.keys():
        l = d2[family]
        dc2[family] = {l[::2][i]: [l[1::2][i]] for i in range(len(l) // 2)}

Now dc1 and dc2 look like this:

>>> dc1
{'mammals': {'bat': ['3'], 'mouse': ['4'], 'dog': ['5'], 'cat': ['4']},
 'bird': {'goose': ['5'], 'cardinal': ['5'], 'robin': ['8'], 'bluejay': ['6']}}
>>> dc2
{'mammals': {'beaver': ['3'], 'horse': ['4'], 'dog': ['3'], 'cow': ['5'], 'cat': ['4']}, 
'bird': {'eagle': ['8'], 'bluejay': ['9'], 'goose': ['8'], 'cardinal': ['5'], 'duck': ['6'], 'robin': ['7']}}
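The same conversion can also be sketched with zip, which pairs the two slices directly instead of indexing into them. This is only an illustration under the assumption that every list strictly alternates name, count; pairs_to_dict is a hypothetical helper name, not part of the answer.

```python
def pairs_to_dict(flat):
    """Turn ['dog', '5', 'cat', '4'] into {'dog': ['5'], 'cat': ['4']}."""
    names = flat[::2]    # items at even indices
    counts = flat[1::2]  # items at odd indices
    # zip pairs each name with the count that follows it
    return {name: [count] for name, count in zip(names, counts)}


d1 = {
    'mammals': ['dog', '5', 'cat', '4', 'mouse', '4', 'bat', '3'],
    'bird': ['robin', '8', 'bluejay', '6', 'goose', '5', 'cardinal', '5'],
}

# apply the helper to every family in the original dictionary
dc1 = {family: pairs_to_dict(flat) for family, flat in d1.items()}
print(dc1['mammals'])
```

Each count stays wrapped in a one-element list so that counts from a second dictionary can later be appended by list concatenation.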

Then you just need to combine them:

>>> d_merged = {}
>>> # dict views cannot be added with + in Python 3, so take the union of sets instead
>>> families = set(d1.keys()) | set(d2.keys())
>>> family2animals = {family: list(set(dc1.get(family, {})) | set(dc2.get(family, {}))) for family in families}
>>> for family in families:
        d_merged[family] = [{animal: dc1.get(family, {}).get(animal, []) + dc2.get(family, {}).get(animal, [])} for animal in family2animals[family]]
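As a possible alternative for the combining step, the sketch below uses collections.defaultdict so that families or animals missing from one input need no special handling. It assumes inputs shaped like dc1 and dc2 above, i.e. {family: {animal: [count, ...]}}, and it returns a dictionary of dictionaries rather than a list of single-key dictionaries. merge_counts is a hypothetical helper name, and the small inputs are made up for illustration.

```python
from collections import defaultdict


def merge_counts(*converted):
    """Merge dicts shaped like {family: {animal: [count, ...]}}."""
    merged = defaultdict(lambda: defaultdict(list))
    for source in converted:
        for family, animals in source.items():
            for animal, counts in animals.items():
                # append this source's counts after any earlier ones
                merged[family][animal].extend(counts)
    # convert the nested defaultdicts back to plain dicts
    return {family: dict(animals) for family, animals in merged.items()}


# small illustrative inputs in the dc1/dc2 shape
dc1 = {'mammals': {'dog': ['5'], 'cat': ['4']}}
dc2 = {'mammals': {'dog': ['3'], 'cow': ['5']}, 'bird': {'eagle': ['8']}}

result = merge_counts(dc1, dc2)
print(result['mammals']['dog'])
```

Because counts are appended in argument order, the first entry for an animal comes from the first dictionary that contains it.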

Answer 1 (score: 2)

The most readable approach is probably the following:

output = {}
for key in d1.keys():
    output[key] = {}
    lst = d1[key]
    for name, count in (lst[i:i+2] for i in range(0, len(lst), 2)):
        output[key][name] = (int(count),)
for key in d2.keys():
    if key not in output:
        output[key] = {}
    lst = d2[key]
    for name, count in (lst[i:i+2] for i in range(0, len(lst), 2)):
        if name in output[key].keys():
            output[key][name] += (int(count),)
        else:
            output[key][name] = (int(count),) 

With less readable dictionary comprehensions, you can do it in two steps:

d = {k: {a: int(b) for a, b in (v[i:i+2] for i in range(0, len(v), 2))} 
     for k, v in d.items()}

turns each input (apply it to both d1 and d2) into a dictionary of dictionaries, e.g.

{'mammals': {'cat': 4, 'cow': 5, 'dog': 3, 'beaver': 3, 'horse': 4}, 
 'bird': {'goose': 8, 'duck': 6, 'eagle': 8, 'bluejay': 9, 'robin': 7, 'cardinal': 5}}

Then

output = {k1: {k2: (d1.get(k1, {}).get(k2), d2.get(k1, {}).get(k2)) 
          for k2 in set(list(d1.get(k1, {}).keys()) + list(d2.get(k1, {}).keys()))} 
          for k1 in set(list(d1.keys()) + list(d2.keys()))}

combines the two.

Note that both approaches work even when the keys differ at either level (e.g. after adding d1['reptiles'] = {'lizard': 10}).
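To illustrate the claim about differing keys, here is a minimal end-to-end sketch of the two-step approach on shortened inputs. The names to_nested, n1 and n2 are hypothetical, the data is trimmed for brevity, and set unions replace the list concatenation used above. With this approach, a count that is missing from one dictionary shows up as None in the tuple.

```python
def to_nested(d):
    """Step 1: turn {family: [name, count, ...]} into {family: {name: int}}."""
    return {k: {a: int(b) for a, b in (v[i:i + 2] for i in range(0, len(v), 2))}
            for k, v in d.items()}


# shortened versions of the question's data
d1 = {'mammals': ['dog', '5', 'cat', '4']}
d2 = {'mammals': ['cat', '4', 'cow', '5']}

n1 = to_nested(d1)
n2 = to_nested(d2)
n1['reptiles'] = {'lizard': 10}  # a family present only on one side

# Step 2: pair up the counts for every family and every name
output = {k1: {k2: (n1.get(k1, {}).get(k2), n2.get(k1, {}).get(k2))
               for k2 in set(n1.get(k1, {})) | set(n2.get(k1, {}))}
          for k1 in set(n1) | set(n2)}

print(output['reptiles'])
```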