Question

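I've written a small program that counts keywords in a string. As you can see, the keywords are stored in a txt file. Today I realized that if a word is repeated in the string, the keyword counter does not increase. Specifically, in this case "wrong" is a keyword in the txt file, and the resulting count is 1 instead of 2.

How can I make it work so that repeated words are counted as well?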

source_text = 'this is wrong. What is wrong with you?' 
source_words = source_text.split()
count = 0    

word_list = []
with open('pozit.txt') as inputfile:
    for line in inputfile:
        word_list.append(line.strip())

for word in word_list:
    if word in source_words:
        count += 1

Answer 1

You can use .count(), which returns the number of occurrences rather than just whether there is a match:

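# source_text is the string from the question
count = 0
with open('pozit.txt') as inputfile:
    for line in inputfile:
        keyword = line.strip()
        if keyword:  # skip blank lines; '' would match everywhere
            count += source_text.count(keyword)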

If you only want words in the linguistic sense, take a look at nltk's tokenizer module.
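
If installing nltk is more than you need, a rough standard-library sketch of whole-word counting is to tokenize with a regular expression and tally the tokens with collections.Counter. This is a stand-in for a real tokenizer, not nltk itself. The helper name count_keywords and the \w+ pattern are my own assumptions, and the keywords are passed in as a list so the snippet runs without pozit.txt:

```python
import re
from collections import Counter


def count_keywords(text, keywords):
    """Count whole-word, case-insensitive occurrences of keywords in text."""
    # \w+ keeps runs of letters, digits and underscores,
    # so 'wrong.' is tokenized as 'wrong'
    tokens = Counter(re.findall(r"\w+", text.lower()))
    # Deduplicate the keywords so a repeated line in the file is not
    # counted twice; Counter returns 0 for tokens that never appear
    unique_keywords = {k.lower() for k in keywords}
    return sum(tokens[k] for k in unique_keywords)


source_text = 'this is wrong. What is wrong with you?'
print(count_keywords(source_text, ['wrong']))  # 2
```

Unlike a plain substring count, this does not count "wrong" inside "wrongly", which may or may not be what you want.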
