A program that opens a text file, counts the words, and reports the top N words sorted by how often they appear in the file?

Asked: 2013-07-05 16:42:13

Tags: python file word-count

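Hi everyone, I am a beginner at programming. I was recently given the task of writing this program, and I am finding it difficult. I previously wrote a program that counts the number of words in a sentence entered by the user; can that program be modified to do what I want?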

import string
def main():
  print "This program calculates the number of words in a sentence"
  print
  p = raw_input("Enter a sentence: ")
  words = string.split(p)
  wordCount = len(words)
  print "The total word count is:", wordCount
main()

4 Answers:

Answer 0 (score: 6)

Use collections.Counter to count the words and open() to open the file:

from collections import Counter
def main():
    #use open() for opening file.
    #Always use `with` statement as it'll automatically close the file for you.
    with open(r'C:\Data\test.txt') as f:
        #create a list of all words fetched from the file using a list comprehension
        words = [word for line in f for word in line.split()]
        print "The total word count is:", len(words)
        #now use collections.Counter
        c = Counter(words)
        for word, count in c.most_common():
            print word, count
main()

collections.Counter example:

>>> from collections import Counter
>>> c = Counter('aaaaabbbdddeeegggg')

Counter.most_common returns the items (here, the characters of the string) sorted by count, most common first:

>>> for word, count in c.most_common(): 
...     print word,count
...     
a 5
g 4
b 3
e 3
d 3
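
Since the question asks for only the top N words, the approach above could be wrapped in a function that takes N. The following is a minimal sketch, assuming Python 3 (where print is a function, unlike the Python 2 code in this answer). The names top_words, lines and n are hypothetical and not part of the original answer.

```python
from collections import Counter


def top_words(lines, n):
    """Return the n most common whitespace-separated words in an iterable of lines."""
    # Counter accepts any iterable, so the words can be fed in one at a time
    counts = Counter(word for line in lines for word in line.split())
    # most_common(n) returns a list of (word, count) pairs, highest count first
    return counts.most_common(n)


# An open file object is an iterable of lines, so it could be passed directly:
#     with open('test.txt') as f:      # hypothetical path
#         print(top_words(f, 3))
sample = ["the cat sat", "the cat ran", "the end"]
for word, count in top_words(sample, 2):
    print(word, count)
```

With the sample lines above, this prints "the 3" followed by "cat 2".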

Answer 1 (score: 1)

To open the file, you can use the open function:

from collections import Counter
with open('input.txt', 'r') as f:
    p = f.read() # p contains contents of entire file
    # logic to compute word counts follows here...

    words = p.split()

    wordCount = len(words)
    print "The total word count is:", wordCount

    # you want the top N words, so grab it as input
    N = int(raw_input("How many words do you want?"))

    c = Counter(words)
    for w, count in c.most_common(N):
        print w, count

Answer 2 (score: 0)

import re
from collections import Counter

with open('file_name.txt') as f:
    sentence = f.read()

words = re.findall(r'\w+', sentence)
word_counts = Counter(words)
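
The snippet above stops once word_counts has been built. One possible way to finish it is sketched below, assuming Python 3. Unlike str.split(), the pattern \w+ skips punctuation, so "dog." and "dog" are counted as the same word. Lowercasing the text first is an added assumption that the answer does not make, and count_words is a hypothetical helper name.

```python
import re
from collections import Counter


def count_words(text):
    """Count the words in text, ignoring case and punctuation."""
    # \w+ matches runs of letters, digits and underscores, so punctuation is skipped
    words = re.findall(r'\w+', text.lower())  # .lower() is an assumption, not in the answer
    return Counter(words)


text = "The cat saw the dog. The dog ran!"
word_counts = count_words(text)
# Report only the two most frequent words
for word, count in word_counts.most_common(2):
    print(word, count)
```

For this sample text the loop prints "the 3" and then "dog 2". Note that \w+ also splits on apostrophes, so a word such as "don't" would be counted as "don" and "t".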

Answer 3 (score: 0)

If anyone else gets an error on the input line, you can try this instead (in Python 3, raw_input was renamed to input):

Code:

N = int(input("\nHow many words do you want: "))
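
If the same script has to run under both Python 2 and Python 3, one common idiom is to pick whichever input function exists. This is only a sketch; read_line and parse_count are hypothetical names introduced for illustration, and the check that N is at least 1 is an added assumption.

```python
try:
    read_line = raw_input  # Python 2: raw_input returns the typed text as a string
except NameError:
    read_line = input      # Python 3: raw_input no longer exists, input does the same job


def parse_count(text):
    """Convert the user's reply to a positive integer, or raise ValueError."""
    n = int(text.strip())
    if n < 1:
        raise ValueError("N must be at least 1")
    return n


# Usage (commented out so that the sketch does not block waiting for input):
#     N = parse_count(read_line("How many words do you want: "))
```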