Python - application for counting words

Time: 2017-01-10 17:30:15

Tags: python dictionary

I have to create a program that counts the words in a text file.

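So, my plan:

- the user enters the name of the txt file,

- the app loads it into the variable 'text',

- converts it to lowercase,

- searches only for words without characters such as '/' or '#' and without spaces, i.e. strings made of letters only,

- turns them into a list of words,

- shows all the words: the first should be the most used, and the last should have been used at least once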

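How can I change it so that it only includes words that are at least 3 characters long? Example: in, on, at <- should not be included; list, word, show, clear <- should be included.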

from collections import Counter
import re


def open_file():
    file_name = input("Enter a filename: ")  # name of the file to open
    with open(file_name) as f:  # the file should exist in the project folder
        text = f.read()  # load the file contents into text
    return text  # the with block has already closed the file


try:
    text = open_file()  # read the file into text
except FileNotFoundError:
    print("File was not found!")
    text = ""  # fall back to an empty string if the file is missing

lower_text = text.lower()  # convert the text to lowercase
# keep only runs of letters, dropping signs like =, #, !
# ('+' instead of '*' so that empty strings are not matched)
text_with_out_special_signs = re.findall(r'[a-z]+', lower_text)

counts_of_words = Counter(text_with_out_special_signs)  # count each word

for x in counts_of_words.most_common():  # show results, most frequent first
    print(x)

1 Answer:

Answer 0: (score: 1)

If you want to remove words that are shorter than 3 characters, you can do it like this:

# keep only words with 3 or more characters (len(w) > 2)
text_more_than_3_char_words = [w for w in text_with_out_special_signs if len(w) > 2]
counts_of_words = Counter(text_more_than_3_char_words)  # count the remaining words
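
As an alternative, the length filter can probably be folded into the regular expression itself, since a quantifier such as `{3,}` only matches runs of three or more letters. The following is a minimal sketch under that assumption; `count_words`, `min_length`, and the sample string are hypothetical names made up for illustration and do not appear in the question. It assumes `min_length` is at least 1.

```python
from collections import Counter
import re


def count_words(text, min_length=3):
    """Count letter-only words that have at least min_length letters."""
    # [a-z]{n,} matches runs of n or more letters, so shorter runs never match
    pattern = r"[a-z]{%d,}" % min_length
    words = re.findall(pattern, text.lower())
    return Counter(words)


# Hypothetical sample text standing in for the contents of the file
sample = "In the list, on the LIST: show at #clear, show!"
counts = count_words(sample)

for word, count in counts.most_common():  # most frequent words first
    print(word, count)
```

With this approach the separate list comprehension is not needed, at the cost of a slightly less obvious pattern. Filtering with `len(w) > 2` as shown above may be easier to read if the minimum length changes often.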