I'm working on a word-occurrence counting application in Python 3.2 on Windows.
Can anyone tell me why the following doesn't work?
from string import punctuation
from operator import itemgetter
N = 100
words = {}
words_gen = (word.strip(punctuation).lower() for line in open("poi_run.txt")
             for word in line.split())
for word in words_gen:
    words[word] = words.get(word, 0) + 1
top_words = (words.iteritems(), key=itemgetter(1), reverse=True)[:N]
for word, frequency in top_words:
    print ("%s %d") % (word, frequency)
The traceback is:
Message File Name Line Position
Traceback
<module> C:\Users\will\Desktop\word_count.py 13
AttributeError: 'dict' object has no attribute 'iteritems'
Thanks
N.B.
Fully working code:
from string import punctuation
from operator import itemgetter
N = 100
words = {}
words_gen = (word.strip(punctuation).lower() for line in open("poi_run.txt")
             for word in line.split())
for word in words_gen:
    words[word] = words.get(word, 0) + 1
top_words = sorted(words.items(), key=itemgetter(1), reverse=True)[:N]
for word, frequency in top_words:
    print ("%s %d" % (word, frequency))
Thanks again, guys.
Answer 0 (score: 4)
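In Python 3, just use items() where you previously used iteritems().
The new items() returns a dictionary view object that supports iteration as well as len and in.
And of course, in top_words = (words.iteritems(), ... you forgot to call the sorted function.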
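As a minimal sketch of what a dictionary view supports, the snippet below uses a small hypothetical dict (not the data from poi_run.txt). It also shows why sorted() is needed before slicing: a view cannot be sliced, but the list returned by sorted() can.

```python
# Hypothetical sample data, not taken from the question's input file.
words = {"the": 3, "cat": 1, "sat": 2}

items = words.items()            # a view object, not a list

print(len(items))                # views support len()
print(("cat", 1) in items)       # ...and membership tests with `in`
for word, count in items:        # ...and iteration
    print(word, count)

# A view cannot be sliced directly; sorted() returns a list,
# which is what makes a [:N] slice possible.
top_two = sorted(items, key=lambda pair: pair[1], reverse=True)[:2]
print(top_two)
```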
Edit: see my other answer for a better solution.
Answer 1 (score: 4)
Consider the Counter class from the collections module; it will do the first for loop for you:
from collections import Counter
N = 100
words_gen = ...
top_words = Counter(words_gen).most_common(N)
for word, frequency in top_words:
    print("%s %d" % (word, frequency))
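For a version that runs without the input file, here is a sketch that swaps in a hypothetical in-memory list of lines (sample_lines is an assumption made for illustration) in place of open("poi_run.txt"). The generator expression is the same one used in the question.

```python
from collections import Counter
from string import punctuation

# Hypothetical in-memory text standing in for poi_run.txt.
sample_lines = ["The cat sat.", "The cat ran!", "The dog?"]

N = 2
# Strip surrounding punctuation and lowercase each whitespace-separated word.
words_gen = (word.strip(punctuation).lower()
             for line in sample_lines
             for word in line.split())

# Counter tallies the words; most_common(N) returns the N highest counts
# as a list of (word, count) tuples, largest first.
top_words = Counter(words_gen).most_common(N)
for word, frequency in top_words:
    print("%s %d" % (word, frequency))
```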
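Answer 2 (score: 2)
From the Python 3.x implementation documents:
"Also, the dict.iterkeys(), dict.iteritems() and dict.itervalues() methods are no longer supported."
See the link above for the correct 3.x API.
The simplest way is to use map() or filter() to get an iterable over the keys.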
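As a rough illustration of the removed methods and their Python 3 counterparts, the sketch below uses a hypothetical two-entry dict. It assumes that plain iteration over the dict, or its keys() and values() methods, is sufficient for the use case, so map() and filter() are not used here.

```python
# Hypothetical sample data for illustration only.
words = {"the": 3, "cat": 2}

# Python 2 name        -> Python 3 counterpart
# words.iterkeys()     -> words.keys()   (or iterate over words directly)
# words.itervalues()   -> words.values()
# words.iteritems()    -> words.items()
keys = sorted(words)               # iterating a dict yields its keys
values = sorted(words.values())
total = sum(words.values())

print(keys, values, total)
print(hasattr(words, "iteritems"))  # the old method is gone in Python 3
```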