Question

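I'm trying to print the N most common words in a text file. So far I have the file handling and the counter working; I just can't figure out how to print the number of entries I want in a pretty way. Here is my code.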

import re
from collections import Counter

def wordcount(user):
    """
    Docstring for word count.
    """
    file = input("Enter full file name w/ extension: ")
    num = int(input("Enter how many words you want displayed: "))

    with open(file) as f:
        text = f.read()

    words = re.findall(r'\w+', text)

    cap_words = [word.upper() for word in words]

    word_counts = Counter(cap_words)

    char, n = word_counts.most_common(num)[0]
    print("WORD: %s \nOCCURENCE: %d " % (char, n) + '\n')

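Basically, I just want to write some kind of loop that prints out the following...

For example, num = 3

So it would print the 3 most common words and their counts:

WORD: Blah OCCURENCE: 3
WORD: bloo OCCURENCE: 2
WORD: blee OCCURENCE: 1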

Answer 1

I would iterate over the "most common" entries as follows:

most_common = word_counts.most_common(num)  # removed the [0] since we're not looking only at the first item!
for item in most_common:
    print("WORD: {} OCCURENCE: {}".format(item[0], item[1]))

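Two comments:
1. Use format() to format strings instead of % - you'll thank me for this advice later!
2. This way you can iterate over any number of "top N" results without hard-coding "3" into your code.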
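
To try the loop above without a file or user input, here is a minimal, self-contained sketch. The function name `format_top_words` and the sample text are hypothetical and made up for illustration; the regex and upper-casing mirror the question's code. It unpacks each (word, count) tuple directly instead of indexing with item[0] and item[1].

```python
import re
from collections import Counter


def format_top_words(text, num):
    """Return one formatted line for each of the `num` most common words."""
    # Same tokenization as in the question: runs of word characters, upper-cased.
    words = [word.upper() for word in re.findall(r'\w+', text)]
    lines = []
    # most_common(num) returns a list of (word, count) tuples, highest count first.
    for word, count in Counter(words).most_common(num):
        lines.append("WORD: {} OCCURENCE: {}".format(word, count))
    return lines


sample = "blah bloo blah blee bloo blah"  # made-up sample text
for line in format_top_words(sample, 3):
    print(line)
```

Returning the lines rather than printing inside the function is only a design choice for this sketch; it keeps the formatting easy to check separately from the input handling.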

Answer 2

Save the most common elements and use a loop.

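common = word_counts.most_common(num)  # keep the whole list; [0] would keep only the first (word, count) pair
for i in range(len(common)):  # loop over however many entries came back instead of a hard-coded 3
    print("WORD: %s \nOCCURENCE: %d \n" % (common[i][0], common[i][1]))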
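
If "pretty" also means aligned columns, one possible approach is to pad each word to the width of the longest one. This is only a sketch: the helper name `format_aligned` and the sample counts are assumptions made for illustration, not part of the question's code.

```python
from collections import Counter


def format_aligned(word_counts, num):
    """Return lines with words left-aligned and counts right-aligned."""
    top = word_counts.most_common(num)
    if not top:
        return []
    # Pad every word to the length of the longest word among the top entries.
    width = max(len(word) for word, _ in top)
    return ["{:<{w}}  {:>3}".format(word, count, w=width) for word, count in top]


counts = Counter({"BLAH": 3, "BLOO": 2, "BLEE": 1})  # made-up counts
for line in format_aligned(counts, 2):
    print(line)
```

Note that most_common(num) simply returns fewer entries when the counter holds fewer than num distinct words, so the loop never needs to know the exact length in advance.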

Print only a specific number of Counter items, with decent formatting
