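I am trying to write a program that merges two dictionaries (TEXT FILES!). The dictionaries consist of nouns and verbs that another program indexed from different corpora and then wrote to text files. They look like this: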
dict1 = {'strawberry': [['eat', 1]], 'family-member': [['look up', 1]], 'mall': [['search', 1]]}
dict2 = {'strawberry': [['eat', 1]], 'family-member': [['lose', 1]], 'ovation': [['receive', 1]], 'mall': [['build', 1]]}
As you can see, they are dictionaries with string keys whose values are lists of lists. Now I am trying to get an output like this:
finaldict = {'strawberry': [['eat', 2]], 'family-member': [['look up', 1], ['lose', 1]], 'mall': [['search', 1], ['build', 1]], 'ovation': [['receive', 1]]}
So far, I have been able to merge dict1 and dict2 like this (in a single string):
{'strawberry': [['eat', 1]], 'family-member': [['look up', 1]], 'mall': [['search',
1]], 'strawberry': [['eat', 1]], 'family-member': [['lose', 1]], 'ovation':
[['receive', 1]], 'mall': [['build', 1]]}
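I convert this string to a dictionary with the following statement: finaldict = eval(str1). That turns the whole thing into a dictionary, and it is reported as one when I inspect the final form, but it does not treat entries such as [['eat', 1]] as values or anything else I can work with. I need that so I can iterate over each item and count how many times it occurs with each verb.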
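The behaviour described above can be reproduced in a few lines. A Python dict literal cannot hold the same key twice, so when the concatenated string is evaluated, the later value for each repeated key silently replaces the earlier one. The sketch below is only an illustration: it uses `ast.literal_eval` in place of `eval` (a substitution, not what the question used) and a shortened version of the string, and the variable name `parsed` is an assumption.

```python
import ast

# Shortened version of the concatenated string from the question
# (assumed content; the real string comes from the text files).
str1 = ("{'strawberry': [['eat', 1]], 'family-member': [['look up', 1]], "
        "'strawberry': [['eat', 1]], 'family-member': [['lose', 1]]}")

# literal_eval only parses Python literals. For a dict literal it treats
# repeated keys the same way eval does: the last value for a key wins.
parsed = ast.literal_eval(str1)

# 'look up' has been dropped and 'eat' is still counted once, not twice.
print(parsed)
```

This suggests that the two dictionaries should probably be parsed separately and merged afterwards, rather than being concatenated into one string first.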
Answer 0 (score: 1)
from collections import Counter
dict1 = {'strawberry': [['eat', 1]], 'family-member': [['look up', 1]], 'mall': [['search', 1]]}
dict2 = {'strawberry': [['eat', 1]], 'family-member': [['lose', 1]], 'ovation': [['receive', 1]], 'mall': [['build', 1]]}
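# Turn each list of [verb, count] pairs from dict1 into a Counter.
result = {k: Counter(dict(v)) for k, v in dict1.items()}
# Add the counts from dict2, creating an empty Counter for nouns not seen yet.
for k, v in dict2.items():
    result.setdefault(k, Counter()).update(dict(v))
# Convert the Counters back into lists of [verb, count] lists.
result = {k: [list(x) for x in v.items()] for k, v in result.items()}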
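The step that makes 'eat' come out as 2 is that `Counter.update` adds to existing counts instead of replacing them. Here is a minimal sketch of just that step; the verb 'pick' and the names `counts` and `merged` are made up for illustration and do not appear in the original data.

```python
from collections import Counter

# dict() turns a list of [word, count] pairs into a mapping a Counter accepts.
counts = Counter(dict([['eat', 1]]))

# Counter.update() adds to existing counts; a plain dict.update() would
# overwrite them. 'pick' is a made-up verb for this example.
counts.update(dict([['eat', 1], ['pick', 1]]))

# Convert back to the list-of-lists shape used in the question.
merged = [list(pair) for pair in counts.items()]
print(merged)
```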
Answer 1 (score: 0)
Nothing too fancy, just breaking it down step by step.
from collections import defaultdict
dict1 = {'strawberry': [['eat', 1]], 'family-member': [['look up', 1]], 'mall': [['search', 1]]}
dict2 = {'strawberry': [['eat', 1]], 'family-member': [['lose', 1]], 'ovation': [['receive', 1]], 'mall': [['build', 1]]}
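# Every noun that appears in either dictionary.
keys = set(dict2.keys()).union(dict1.keys())
final = {}
for k in keys:
    # A noun missing from one dictionary contributes no verbs from it.
    d1val = dict1.get(k, [])
    d2val = dict2.get(k, [])
    # Verb counts for this noun, starting at 0 for unseen verbs.
    resd = defaultdict(lambda: 0)
    for word, count in d1val:
        resd[word] += count
    for word, count in d2val:
        resd[word] += count
    # Store the totals as a list of [verb, count] lists.
    final[k] = [list(i) for i in resd.items()]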