Storing the count of each word that occurs in a text file

Time: 2013-04-01 03:15:28

Tags: python

I want to store the count of each word that occurs in a text file in a dictionary. I open the file like this:

fob= open('D:/project/report.txt','r')

I am able to store the lines in a list, but I need to split those lines into individual words and finally store their counts (as in a dictionary).

lst = fob.readlines()

#This doesn't work, gives error
#AttributeError: 'list' object has no attribute 'split' 
mylst=lst.split()

How can I do this? What is an efficient way to do it?

1 answer:

Answer 0: (score: 1)

For Python 2.7+

from collections import Counter

with open('D:/project/report.txt','r') as fob:
    c = Counter(word for line in fob for word in line.split())
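As a rough illustration of what the resulting Counter offers, here is a minimal sketch that runs the same generator expression over a hypothetical in-memory list of lines (the sample text is an assumption, standing in for the contents of report.txt):

```python
from collections import Counter

# Hypothetical sample lines standing in for the contents of report.txt.
lines = ["the quick brown fox\n", "jumps over the lazy dog\n", "the end\n"]

# Same generator expression as above, fed from a list instead of a file.
c = Counter(word for line in lines for word in line.split())

print(c["the"])          # count for a single word
print(c["missing"])      # a word that never appeared counts as 0, no KeyError
print(c.most_common(1))  # the most frequent word together with its count
```

A Counter is a dict subclass, so it can be used anywhere a plain dictionary of counts is expected.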

For Python 2.5+

from __future__ import with_statement  # required on Python 2.5 only; harmless on later versions
from collections import defaultdict
dd = defaultdict(int)

with open('D:/project/report.txt','r') as fob:
    for line in fob:
        for word in line.split():
            dd[word] += 1
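The reason `dd[word] += 1` works without checking for the key first is that `defaultdict(int)` creates missing entries by calling `int()`, which returns 0. A small sketch on a hypothetical sample string (not the real file) shows this:

```python
from collections import defaultdict

dd = defaultdict(int)  # int() returns 0, so every new key starts at 0

for word in "a b a c a b".split():  # hypothetical sample text
    dd[word] += 1  # no need to test whether the key exists first

print(dict(dd))  # convert to a plain dict for display
```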

For older Pythons, or for people who dislike defaultdict
d = {}

with open('D:/project/report.txt','r') as fob:
    for line in fob:
        for word in line.split():
            d[word] = d.get(word, 0) + 1
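To relate this back to the attempt in the question: `readlines()` returns a list of strings, and `split` is a string method, so it has to be called on each line rather than on the list itself. The sketch below assumes an `io.StringIO` object with made-up contents in place of the opened file so that it runs anywhere:

```python
import io

# io.StringIO stands in for the opened file; the contents are made up.
fob = io.StringIO(u"spam eggs spam\nham spam eggs\n")

lst = fob.readlines()  # a list of lines, one string per line

d = {}
for line in lst:  # split each line, not the list itself
    for word in line.split():
        d[word] = d.get(word, 0) + 1

print(d)
```

Iterating over the file object directly, as the answers above do, avoids holding all lines in memory at once, which may matter for large files.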