I have a word-occurrence dictionary and a synonym dictionary.
Word-occurrence dictionary example:
word_count = {'grizzly': 2, 'panda': 4, 'beer': 3, 'ale': 5}
Synonym dictionary example:
synonyms = {
    'bear': ['grizzly', 'bear', 'panda', 'kodiak'],
    'beer': ['beer', 'ale', 'lager']
}
I would like to aggregate/rename the word-count dictionary into
new_word_count = {'bear': 6, 'beer': 8}
I thought I would try this:
new_dict = {}
for word_key, word_value in word_count.items():  # Loop through word count dict
    for syn_key, syn_value in synonyms.items():  # Loop through synonym dict
        if word_key in [x for y in syn_value for x in y]:  # Check if word in synonyms
            if syn_key in new_dict:  # If so:
                new_dict[syn_key] += word_value  # Increment count
            else:  # If not:
                new_dict[syn_key] = word_value  # Create key
But this doesn't work: new_dict ends up empty. Also, is there a simpler way to do this? Perhaps with a dict comprehension?
Answer 0: (score: 4)
In [11]: {w: sum(word_count.get(x, 0) for x in ws) for w, ws in synonyms.items()}
Out[11]: {'bear': 6, 'beer': 8}
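As for why the loop in the question leaves new_dict empty, here is a small sketch using the question's own data. My reading is that syn_value is already a flat list of strings, so the nested comprehension walks each string and yields single characters, and no whole word can ever match. Testing membership against the list itself seems to be enough:

```python
word_count = {'grizzly': 2, 'panda': 4, 'beer': 3, 'ale': 5}
synonyms = {
    'bear': ['grizzly', 'bear', 'panda', 'kodiak'],
    'beer': ['beer', 'ale', 'lager'],
}

# Flattening a list of strings splits every string into its characters.
chars = [x for y in synonyms['beer'] for x in y]
print(chars[:4])  # ['b', 'e', 'e', 'r']

new_dict = {}
for word_key, word_value in word_count.items():
    for syn_key, syn_value in synonyms.items():
        if word_key in syn_value:  # membership test against the list itself
            # dict.get supplies 0 the first time a key is seen
            new_dict[syn_key] = new_dict.get(syn_key, 0) + word_value

print(new_dict)  # {'bear': 6, 'beer': 8}
```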
Using collections.Counter and dict.get:
from collections import Counter
ec = Counter()
for x, vs in synonyms.items():
    for v in vs:
        ec[x] += word_count.get(v, 0)  # words missing from word_count add 0
print(ec)  # Counter({'beer': 8, 'bear': 6})
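Answer 1: (score: 1)
Let's change your synonym dictionary slightly. Instead of mapping a word to the list of all its synonyms, we map each word to its parent synonym (i.e. ale to beer). This should speed up lookups: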
synonyms = {
    'bear': ['grizzly', 'bear', 'panda', 'kodiak'],
    'beer': ['beer', 'ale', 'lager']
}
synonyms = {syn: word for word, syns in synonyms.items() for syn in syns}
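To make the inverted mapping concrete, here is a short sketch of what the comprehension produces. The name parent is my own choice for illustration (the answer reuses the name synonyms). One assumption to flag: each word belongs to only one group, otherwise a later group silently overwrites an earlier one.

```python
synonyms = {
    'bear': ['grizzly', 'bear', 'panda', 'kodiak'],
    'beer': ['beer', 'ale', 'lager'],
}

# Invert the mapping: every synonym points at its group key.
# 'parent' is a hypothetical name used only in this sketch.
parent = {syn: word for word, syns in synonyms.items() for syn in syns}

print(parent['ale'])     # beer
print(parent['kodiak'])  # bear
print(len(parent))       # 7
```

Each lookup is now a single dictionary access instead of a scan over every synonym list.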
Now, let's build your aggregated dictionary:
word_count = {'grizzly': 2, 'panda': 4, 'beer': 3, 'ale': 5}
new_word_count = {}
for word, count in word_count.items():  # .items() yields (word, count) pairs
    word = synonyms[word]  # replace the word with its parent synonym
    if word not in new_word_count:
        new_word_count[word] = 0
    new_word_count[word] += count
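One caveat: the aggregation step indexes the inverted mapping directly, so a counted word with no synonym entry raises KeyError. A possible variation, sketched below, falls back to the word itself and lets Counter handle the zero default. The 'wine' entry and the name parent are hypothetical additions for illustration, and keeping unknown words under their own key is an assumption about what you want.

```python
from collections import Counter

# 'wine' is a made-up entry with no synonym group, added for illustration.
word_count = {'grizzly': 2, 'panda': 4, 'beer': 3, 'ale': 5, 'wine': 1}
synonyms = {
    'bear': ['grizzly', 'bear', 'panda', 'kodiak'],
    'beer': ['beer', 'ale', 'lager'],
}
parent = {syn: word for word, syns in synonyms.items() for syn in syns}

new_word_count = Counter()
for word, count in word_count.items():
    # Fall back to the word itself when it has no parent synonym.
    new_word_count[parent.get(word, word)] += count

print(dict(new_word_count))  # {'bear': 6, 'beer': 8, 'wine': 1}
```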