Counting a specific word in a text file using the count method

Time: 2016-07-31 02:02:06

Tags: python file text count

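I am trying to count the number of times the word 'the' appears in two books saved as text files. The code I am running returns zero for each book.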

Here is my code:


What am I doing wrong here?

4 answers:

Answer 0: (score: 3)

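You are reassigning word_count on every iteration. That means that at the end it holds the number of occurrences of 'the' in the last line of the file only. You should be accumulating the sum instead. Another thing: should 'there' match? Probably not. You probably want to use line.split(), so that only whole whitespace-separated tokens are compared. Also, you can iterate over the file object directly; there is no need for .readlines(). Finally, a generator expression simplifies the code. My first example is without a generator expression; the second is with one: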

# Version 1: explicit loop that accumulates a running total
def word_count(filename):
    with open(filename) as f_obj:
        total = 0
        for line in f_obj:
            total += line.lower().split().count('the')
        print(total)

# Version 2: the same count written with a generator expression
def word_count(filename):
    with open(filename) as f_obj:
        total = sum(line.lower().split().count('the') for line in f_obj)
        print(total)
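
To see why split() matters here, the following minimal sketch compares the two approaches on a made-up sample line (the sample string is an assumption for illustration, not text from either book):

```python
# Hypothetical sample line, used only for illustration.
line = "The theory was that there is another theme in the book"

# str.count matches substrings, so 'theory', 'there', 'another'
# and 'theme' are all counted alongside the two real 'the' tokens.
substring_hits = line.lower().count('the')

# Splitting on whitespace first means only tokens equal to 'the' count.
word_hits = line.lower().split().count('the')

print(substring_hits, word_hits)  # prints: 6 2
```

Note that split() only separates on whitespace, so a token such as 'the,' with punctuation attached would still not be counted.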

Answer 1: (score: 1)

Unless the last line of each file contains the word 'the', you will see zero.
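
A small sketch of that effect, using a made-up list of lines in place of a real file (the lines and variable names are assumptions for illustration):

```python
# Hypothetical lines standing in for the contents of a file.
lines = ["The cat saw the dog.\n", "It ran away.\n"]

last_only = 0
running_total = 0
for line in lines:
    last_only = line.lower().count('the')       # overwritten on every pass
    running_total += line.lower().count('the')  # accumulates across passes

print(last_only, running_total)  # prints: 0 2
```

Plain assignment keeps only the count from the final line, which is 0 here, while the accumulated total reflects every line.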

You probably want to initialize the word_count variable to zero and then use augmented addition (+=):

For example:

def word_count(filename):
    """Count specified words in a text"""
    try:
        word_count = 0                                       # <- change #1 here
        with open(filename) as f_obj:
            contents = f_obj.readlines()
            for line in contents:
                word_count += line.lower().count('the')      # <- change #2 here
            print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)

Augmented addition is not required; it is just convenient. This line:

word_count += line.lower().count('the')

can also be written as

word_count = word_count + line.lower().count('the')

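But you also don't need to read all the lines into memory at once. You can iterate over the lines directly from the file object. For example: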

def word_count(filename):
    """Count specified words in a text"""
    try:
        word_count = 0
        with open(filename) as f_obj:
            for line in f_obj:                     # <- change here
                word_count += line.lower().count('the')
        print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)

Answer 2: (score: 1)

Another way:

with open(filename) as f_obj:
    contents = f_obj.read()
    print("The word 'the' appears " + str(contents.lower().count('the')) + " times")
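
Counting on the whole string this way also matches substrings such as 'there', while split() alone would miss tokens with punctuation attached, such as 'the,'. One possible variation, sketched here with the standard re module, counts whole words only. The helper name count_word and the sample string are assumptions made up for this example:

```python
import re

def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    # \b matches a word boundary, so 'then' and 'there' are not counted.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Hypothetical sample text, used only for illustration.
sample = "The end, the beginning; then the, there."
print(count_word(sample, "the"))  # prints: 3
```

For a real book, the text could be read with f_obj.read() as in the snippet above and passed in as the first argument.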

Answer 3: (score: 0)

import os
def word_count(filename):
    """Count specified words in a text"""
    if os.path.exists(filename):
        if not os.path.isdir(filename):
            with open(filename) as f_obj:
                print(f_obj.read().lower().count('the'))
        else:
            print("is path to folder, not to file '%s'" % filename)
    else:
        print("path not found '%s'" % filename)
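
A variation on the same idea, as a rough sketch: returning the count instead of printing it makes the function easier to reuse and to test. The name count_the, the use of pathlib, and the choice to return None for a bad path are all assumptions of this example, not part of the answer above:

```python
from pathlib import Path

def count_the(filename):
    """Return the number of whitespace-separated tokens equal to 'the'.

    Returns None when filename is missing or is not a regular file.
    """
    path = Path(filename)
    if not path.is_file():  # False for missing paths and for directories
        return None
    with path.open() as f_obj:
        # Iterate lazily over the file, summing the per-line counts.
        return sum(line.lower().split().count('the') for line in f_obj)
```

The caller can then decide how to report a None result, for example by printing a "path not found" message as the answer above does.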