Question

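I have used xlrd to append Excel worksheet values to a list, which I called a_master. I also have a text file of words (I called this file dictionary, and it has one word per line), and I want to count how many times each of those words occurs in the list. Here is the code: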

with open("dictionary.txt","r") as f:
    for line in f:
        print "Count " + line + str((a_master).count(line))

For some reason, the count comes back as zero for every word in the text file. If I write out the count for one of those words myself:

print str((a_master).count("server"))

it counts the occurrences without a problem. I have also tried

print line

to check whether it reads the words in the dictionary.txt file correctly, and it does.

Answer 1

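Lines read from a file are terminated by a newline character, and there may be other trailing whitespace as well. It is best to strip any whitespace before doing the lookup: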

with open("dictionary.txt","r") as f:
    for line in f:
        print "Count " + line + str((a_master).count(line.strip()))
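To make the cause concrete, here is a minimal sketch of the comparison that list.count performs. It is written in Python 3 syntax, and the sample contents of a_master are hypothetical stand-ins for the values read from the spreadsheet.

```python
# Hypothetical stand-in for the list built from the spreadsheet.
a_master = ["server", "router", "server"]

# A line as file iteration yields it: the word plus a trailing newline.
line = "server\n"

# list.count compares with ==, and "server\n" is not equal to "server".
raw_count = a_master.count(line)

# After stripping, the strings are equal and the matches are found.
stripped_count = a_master.count(line.strip())

print(repr(line), raw_count, stripped_count)
```

Printing repr(line) rather than line is a handy way to spot the hidden newline, since a plain print only shows it as an extra blank line.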

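Note that searching a list is linear: every call to count scans the whole list, which is probably not optimal in most cases. I think collections.Counter fits the situation you describe.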

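Reinterpret your list as a dictionary, where the keys are the items and the values are their occurrence counts, by passing it through collections.Counter as shown below: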

import collections
a_master = collections.Counter(a_master)

You can then rewrite your code as

from itertools import imap
with open("dictionary.txt","r") as f:
    for line in imap(str.strip, f):
        print "Count {} {}".format(line, a_master[line])
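The snippet above is Python 2 (print statement, itertools.imap). As a rough Python 3 sketch of the same idea, the following builds the Counter once and then looks up each stripped line. The helper name count_words and the sample data are assumptions made for illustration, and io.StringIO stands in for the real dictionary.txt so the example runs on its own.

```python
import collections
import io


def count_words(a_master, word_lines):
    """Return (word, count) pairs for each non-empty line in word_lines."""
    # One pass over the list; later lookups are dictionary accesses.
    counts = collections.Counter(a_master)
    results = []
    for line in word_lines:
        word = line.strip()  # drop the newline and surrounding whitespace
        if word:  # skip blank lines
            # A Counter returns 0 for keys it has never seen.
            results.append((word, counts[word]))
    return results


# Hypothetical data: replace with the real list and open("dictionary.txt").
a_master = ["server", "router", "server", "switch"]
dictionary_file = io.StringIO("server\nrouter\nfirewall\n")

results = count_words(a_master, dictionary_file)
for word, count in results:
    print("Count {} {}".format(word, count))
```

Because a Counter returns 0 for missing keys, words that never appear in the list need no special handling.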

Answer 2

Use collections.Counter():

import re
import collections
words = re.findall(r'\w+', open('dictionary.txt').read().lower())
collections.Counter(words)

Why is this question tagged xlrd?
