My file is named content_data, and its contents are as follows:
A house is house that must be beautiful house and never regrets the regrets for the baloon in
the baloons. Find the words that must be the repeated words in the file of house and ballons
We need the result as a dictionary, in a format like
{'house':4,'baloon':3,'in':4........},
that is, {word: count}.
Can anyone tell me how to do this?
Answer 0 (score: 1)
from collections import Counter
from string import punctuation
counter = Counter()
with open('/tmp/content_data') as f:
    for line in f:
        # split on whitespace, then strip punctuation from the ends of each word
        counter.update(word.strip(punctuation) for word in line.split())
result = dict(counter)
# Note: Counter is a subclass of dict, so
#   isinstance(counter, dict)
# is True and you may as well leave the result as a Counter object.
print(result)
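The question asks specifically for the repeated words, while the code above counts every word and treats "House" and "house" as different keys. A minimal sketch of one way to handle both is shown below. It assumes that case should be ignored and that "repeated" means a word appears at least twice. The function name `count_repeated_words` and the `min_count` parameter are hypothetical, introduced only for illustration.

```python
from collections import Counter
from string import punctuation


def count_repeated_words(lines, min_count=2):
    """Count words case-insensitively; keep those seen at least min_count times."""
    counter = Counter()
    for line in lines:
        # Lowercase so "House" and "house" share one key; strip edge punctuation.
        words = (word.strip(punctuation).lower() for word in line.split())
        # Skip tokens that were pure punctuation and are now empty strings.
        counter.update(word for word in words if word)
    # Assumption: "repeated" means the word occurs min_count times or more.
    return {word: count for word, count in counter.items() if count >= min_count}


# The two lines of content_data from the question, used here as in-memory input.
sample = [
    "A house is house that must be beautiful house and never regrets the regrets for the baloon in",
    "the baloons. Find the words that must be the repeated words in the file of house and ballons",
]
repeated = count_repeated_words(sample)
print(repeated)
```

Because an open file is itself an iterable of lines, the same function can be called as `count_repeated_words(f)` inside the `with open(...)` block. Note that "baloon", "baloons" and "ballons" remain three separate words here, since no stemming or spelling correction is attempted.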