Question

I'm trying to count how many times a word appears in a txt file when the word is attached to other letters.

Detect: Hello

Text: Hellooo, how are you?

Expected output: 1

Here is my current code:

total = 0

with open('text.txt') as f:
    for line in f:
        finded = line.find('Hello')
        if finded != -1 and finded != 0:
            total += 1

print(total)

Do you know how I can fix this?

Answer 1

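For each line, you can iterate over its words by splitting the line on spaces, which turns the line into a list of words. Then loop over the words and check whether the string occurs in each word: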

total = 0

with open('text.txt') as f:
    # Iterate through lines
    for line in f:
        # Iterate through words by splitting on spaces
        for word in line.split(' '):
            # Match string in word
            if 'Hello' in word:
                total += 1

print(total)
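
To try this approach without a file, here is a minimal sketch that wraps the same loop in a function. The name count_words_containing and the sample lines are made up for illustration, and it uses split() with no argument (any whitespace) rather than split(' '):

```python
def count_words_containing(lines, needle):
    """Count whitespace-separated words that contain `needle`."""
    total = 0
    for line in lines:
        # split() with no argument splits on any run of whitespace,
        # so tabs and the trailing newline do not produce odd "words"
        for word in line.split():
            if needle in word:
                total += 1
    return total


# Hypothetical sample input standing in for the lines of text.txt
sample = ["Hellooo, how are you?", "Hello Hello again"]
print(count_words_containing(sample, "Hello"))  # 3
```

Note that each word is counted at most once, even if the string occurs in it several times, and that a standalone "Hello" is counted as well.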

Answer 2

As @SruthiV suggested in the comments, you can use re.findall from the re module:

import re

pattern = re.compile(r"Hello")

total = 0
with open('text.txt', 'r') as fin:
    for line in fin:
        total += len(re.findall(pattern, line))

print(total)

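re.compile compiles the regular expression "Hello" into a pattern object. Compiling once with re.compile can improve performance, and some people recommend it when the same pattern is reused many times.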

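The rest of the program opens the file, reads it line by line, and uses re.findall to find the occurrences of the pattern in each line. Because re.findall returns a list of matches, total is incremented by the length of that list, i.e. the number of matches in the given line.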
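
As a small illustration of what findall returns, here is a sketch on a made-up sample string (the string is an assumption, not taken from the question's file):

```python
import re

pattern = re.compile(r"Hello")

# findall returns one list entry per non-overlapping match
matches = pattern.findall("Hellooo, Hello!")
print(matches)       # ['Hello', 'Hello']
print(len(matches))  # 2
```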

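Note: this program counts every occurrence of Hello, whether it stands alone as a word or is part of another word. It is also case-sensitive, so hello is not counted.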
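
If either behaviour is unwanted, the pattern can be adjusted. The following is only a sketch under two assumptions: that "attached to other letters" means at least one word character (\w) directly before or after Hello, and that case should be ignored in the first variant. The names ANY_CASE, ATTACHED and count_matches are hypothetical:

```python
import re

# Case-insensitive variant: matches "Hello", "hello", "HELLO", ...
ANY_CASE = re.compile(r"hello", re.IGNORECASE)

# Matches "Hello" only when at least one word character is attached
# before it (first alternative) or after it (second alternative)
ATTACHED = re.compile(r"\w+Hello\w*|Hello\w+")


def count_matches(pattern, lines):
    """Sum the number of non-overlapping matches over all lines."""
    return sum(len(pattern.findall(line)) for line in lines)


print(count_matches(ATTACHED, ["Hellooo, how are you?"]))  # 1
print(count_matches(ATTACHED, ["Hello there"]))            # 0
print(count_matches(ANY_CASE, ["Hello hello HELLO"]))      # 3
```

With ATTACHED, a standalone Hello is skipped, which matches the expected output of 1 for the example text in the question.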
