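I have a master dictionary that holds word frequencies for the whole corpus, and a separate dictionary per text file that holds that file's word frequencies. I loop over the files, build the word frequencies for each one, and then merge them into the master dictionary. My code is below. Is there a shortcut? Thanks!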
master_dict = {}
for txtfile in txtfiles:
    # get_word_freq is defined elsewhere and returns a {word: count} dict
    file_dict = get_word_freq(txtfile)
    for k, v in file_dict.items():
        if k in master_dict:
            master_dict[k] += v
        else:
            master_dict[k] = v
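Answer 0 (score: 4)
You should consider using Python's built-in `Counter` class from the `collections` module. Counters can be added together, and the counts for matching keys are summed.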
from collections import Counter
words_a = 'one two three'
words_b = 'one two one two'
words_c = 'three four five'
a = Counter(words_a.split())
b = Counter(words_b.split())
c = Counter(words_c.split())
print(a + b + c)
# outputs Counter({'one': 3, 'two': 3, 'three': 2, 'four': 1, 'five': 1})
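Applied to the loop in the question, the same idea might look like the sketch below. It assumes `get_word_freq` returns a dict-like mapping of word to count. The version here is a hypothetical stand-in that takes a string instead of a file path, and `txtfiles` is made-up in-memory data so the example runs on its own.

```python
from collections import Counter

def get_word_freq(text):
    # Hypothetical stand-in for the asker's function: counts
    # whitespace-separated words in a string instead of reading a file.
    return Counter(text.split())

# Made-up in-memory "files" (name -> contents), for illustration only.
txtfiles = {
    'a.txt': 'one two three',
    'b.txt': 'one two one two',
    'c.txt': 'three four five',
}

master_dict = Counter()
file_dicts = {}
for name, text in txtfiles.items():
    file_dicts[name] = get_word_freq(text)
    # Counter.update adds counts to existing keys instead of replacing them.
    master_dict.update(file_dicts[name])

print(master_dict)
```

`Counter.update` accepts any mapping, so this should also work if `get_word_freq` returns a plain dict. The per-file counts stay available in `file_dicts`.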