Python word count and rank

Time: 2011-10-25 11:47:05

Tags: python python-3.x

I am working on a word occurrence count application in a Python 3.2 / Windows environment.

Can anyone tell me why the following doesn't work?

from string import punctuation
from operator import itemgetter

N = 100
words = {}

words_gen = (word.strip(punctuation).lower() for line in open("poi_run.txt")
                                         for word in line.split())

for word in words_gen:
    words[word] = words.get(word, 0) + 1

top_words = (words.iteritems(), key=itemgetter(1), reverse=True)[:N]

for word, frequency in top_words:
    print ("%s %d") % (word, frequency)

The traceback error is:

Message File Name   Line    Position    
Traceback               
    <module>    C:\Users\will\Desktop\word_count.py 13      
AttributeError: 'dict' object has no attribute 'iteritems'              

Thanks

N.B.

Fully working code:

from string import punctuation
from operator import itemgetter

N = 100
words = {}

words_gen = (word.strip(punctuation).lower() for line in open("poi_run.txt")
                                         for word in line.split())

for word in words_gen:
    words[word] = words.get(word, 0) + 1

top_words = sorted(words.items(), key=itemgetter(1), reverse=True)[:N]

for word, frequency in top_words:
    print ("%s %d" % (word, frequency))

Thanks again, guys

3 answers:

Answer 0 (score: 4)

In Python 3, just use items where you previously used iteritems.

The new items() returns a dictionary view object that supports iteration as well as len and in.

And of course, in top_words = (words.iteritems(), ... you forgot to call the sorted function.
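
A minimal sketch of both points, assuming a small made-up dictionary (the sample words and counts are not from the question): the view returned by items() supports len and in, and the slice works once sorted is actually called.

```python
from operator import itemgetter

# Hypothetical sample counts, standing in for the dictionary built from the file.
words = {"the": 3, "cat": 1, "sat": 2}

view = words.items()           # a dictionary view object, not a list
print(len(view))               # views support len()
print(("cat", 1) in view)      # and membership tests with "in"

words["mat"] = 1
print(len(view))               # the view reflects later changes to the dict

# A view cannot be sliced directly; sorted() returns a list, which can.
top_words = sorted(words.items(), key=itemgetter(1), reverse=True)[:2]
print(top_words)
```

If a real list of pairs is needed rather than a view, list(words.items()) gives one.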


Edit: see my other answer for a better solution.

Answer 1 (score: 4)

Consider the Counter class in the collections module - it will do the first for loop for you:

from collections import Counter

N = 100
words_gen = ...

top_words = Counter(words_gen).most_common(N)

for word, frequency in top_words:
    print("%s %d" % (word, frequency))
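
For anyone who wants to try this without the input file, here is a self-contained sketch. The helper name top_words_from, the in-memory sample text, and the filter that drops empty strings are my own assumptions, not part of the question.

```python
import io
from collections import Counter
from string import punctuation


def top_words_from(lines, n):
    """Return the n most common words in an iterable of text lines."""
    words_gen = (word.strip(punctuation).lower()
                 for line in lines
                 for word in line.split())
    # Assumption: tokens made only of punctuation strip down to "" and are skipped.
    return Counter(word for word in words_gen if word).most_common(n)


# Hypothetical sample standing in for open("poi_run.txt").
sample = io.StringIO("The cat sat. The cat ran!\nThe end.\n")

for word, frequency in top_words_from(sample, 2):
    print("%s %d" % (word, frequency))
```

With a real file, passing the open file object in place of sample should work the same way, since both yield lines when iterated.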

Answer 2 (score: 2)

From the Python 3.x implementation documents:

  

"Also, the dict.iterkeys(), dict.iteritems() and dict.itervalues() methods are no longer supported."

See the link above to get the correct API for 3.x.

The simplest way would be to use map() or filter() to get the iteration keys.
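
As a rough sketch of what the removed methods map to in Python 3 (the sample dictionary is made up for illustration): keys(), values() and items() can be iterated directly, and list() turns any of them into a list when one is needed.

```python
# Hypothetical sample counts for illustration only.
words = {"the": 3, "cat": 2}

keys = list(words.keys())       # was words.iterkeys() in Python 2
values = list(words.values())   # was words.itervalues() in Python 2
pairs = list(words.items())     # was words.iteritems() in Python 2

# Iterating over the dict itself yields its keys.
for word in words:
    print(word, words[word])
```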