Counting the words in a text file with Python

Time: 2015-09-17 00:53:36

Tags: python string parsing python-3.x

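Below is the project I was assigned: I have to write a program that opens a text file and counts the words in it. The program I've outlined below is supposed to set previous = False for any whitespace character (i.e. space, tab, \n) and previous = True for anything else. When previous is False and the current character is not whitespace (the start of a word), it adds 1 to wordCount. However, my output shows a different result (see below). The restriction is that I can't use the .split() function; it has to be done by hand. Since this is a school assignment, I'm not looking for someone to do it for me, just to teach me a bit and explain what I'm doing wrong.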

Code:

"""
title: findWords.py
author: Riley Lloyd
"""

#import necessary libraries
import turtle as bob

def init():
    """

    :return:
    """
    pass

def countWords ( textFileName ):
    """

    :param textFileName:
    :return:
    """
    previous = False
    wordCount = 0
    for text in open(textFileName):
        print(text)
        if text == " ":
            previous = False
        elif text == "\n":
            previous = False
        elif text == "\t":
            previous = False
        else:
            if previous == False:
                wordCount += 1
            previous = True
    print(wordCount)


def main():
    """

    :return:
    """
    init()
    countWords(input("Enter filename: "))

main()

Output:

Enter filename: some.txt
Words make up other words.

This is a line.

  Sequences of words make sentences.

I like words but I don't like MS Word.

    There's another word for how I feel about MSWord: @#%&

1

Process finished with exit code 0

1 answer:

Answer 0 (score: 4)

You are iterating over the open file object:

for text in open(textFileName):

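When you do this, you are actually iterating over the lines of the file: on the first iteration text is the first line of the file, on the second iteration text is the second line, and so on. Your logic, however, is written as if text were each individual character in the file. That is why the count comes out as 1: the first line is not equal to " ", "\n" or "\t", so it is counted once as a "word", and since no later line equals one of those characters either, previous is never reset to False.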
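To make the difference concrete, here is a small sketch. It uses io.StringIO as a stand-in for a real file (an assumption made only so the example runs without a file on disk); a file object returned by open() in text mode iterates the same way.

```python
import io

# io.StringIO stands in for a file opened with open(); both yield one line per iteration.
sample = io.StringIO("This is a line.\nAnother line.\n")

# Iterating the file-like object gives whole lines, newline included.
lines = [text for text in sample]

# Rewind, then iterate the string returned by .read(): this gives single characters.
sample.seek(0)
chars = [text for text in sample.read()]

print(lines)      # ['This is a line.\n', 'Another line.\n']
print(chars[:4])  # ['T', 'h', 'i', 's']
```

A whole line such as 'This is a line.\n' never compares equal to " " or "\t", which is why the whitespace checks in the question never fire.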

If your file is not large, I would suggest calling .read() on it and iterating over the result. Example:

def countWords ( textFileName ):
    """

    :param textFileName:
    :return:
    """
    with open(textFileName) as f:
        texts = f.read()
        previous = False
        wordCount = 0
        for text in texts:
            print(text)
            if text == " ":
                previous = False
            elif text == "\n":
                previous = False
            elif text == "\t":
                previous = False
            else:
                if previous == False:
                    wordCount += 1
                previous = True
        print(wordCount)

I have used a with statement to open the file; you should do the same, since it automatically takes care of closing the file for you.
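If the file might be too large to read in one go, one possible variation is to keep iterating line by line and add an inner loop over the characters of each line. The sketch below is only an illustration, not part of the original assignment: the name count_words and the variable in_word (playing the role of previous) are made up here, it returns the count instead of printing it, and it uses str.isspace(), which treats more characters as whitespace than just space, tab and "\n".

```python
def count_words(path):
    """Count whitespace-separated words in a file without using str.split()."""
    in_word = False   # same role as `previous` in the code above
    word_count = 0
    with open(path) as f:
        for line in f:        # one line at a time, so the whole file is never held in memory
            for ch in line:   # then one character at a time
                if ch.isspace():
                    in_word = False       # whitespace ends the current word, if any
                elif not in_word:
                    in_word = True        # first non-whitespace character starts a new word
                    word_count += 1
    return word_count
```

Because every line except possibly the last ends with "\n", the in_word flag is reset at each line break, so a word is never carried over from one line to the next.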