Search for all sentences that contain a string from another document

Asked: 2013-10-04 16:02:57

Tags: python string-matching

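I have a file containing 200 words, each on its own line. I want to search for all of these words in another file and print every sentence that contains one of them. Right now, only the matches for the first word are printed; after that, it stops.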

corpus = open('C:\\Users\\Lucas\\Desktop\\HAIT\\Scriptie\\Tweet-corpus\\Corpus.txt', 'r', encoding='utf8')

with open('C:\\Users\\Lucas\\Desktop\\HAIT\\Scriptie\\Tweet-corpus\\MostCommon3.txt', 'r', encoding='utf8') as list:
    for line in list:
        for a in corpus:
            if line in a:
                print(a)

1 answer:

Answer 0 (score: 2)

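# Prepare the list of words, stripping the trailing newline from each
# line and skipping blank lines (an empty string matches every sentence)
with open('wordfile', 'r', encoding='utf8') as word_file:
    words = [word.strip() for word in word_file if word.strip()]

# Now examine each sentence; the sentence file is read only once
with open('sentencefile', 'r', encoding='utf8') as sentences:
    for sentence in sentences:
        found = False
        for word in words:
            if word in sentence:
                found = True
                break
        if found:
            # Drop the line's own newline so print() does not double it
            print(sentence.rstrip('\n'))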
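
Some context on why the loop in the question only matched the first word: a file object is an iterator, so once the inner `for a in corpus` has run to the end, it yields nothing for any later word. Each `line` read from the word file also still ends in a newline, which makes `line in a` stricter than intended. The sketch below is one possible way to package the approach above as a reusable function, using `any()` in place of the `found` flag. The function name `sentences_containing` and the in-memory sample data (built with `io.StringIO`) are assumptions made for illustration and are not part of the original answer.

```python
import io


def sentences_containing(words, sentences):
    """Yield each sentence that contains at least one of the given words.

    `words` and `sentences` are iterables of strings (for example, open
    file objects). Matching is plain substring matching.
    """
    # Read the words once, dropping newlines and blank lines, so the
    # same list can be reused for every sentence.
    cleaned = [w.strip() for w in words if w.strip()]
    for sentence in sentences:
        if any(word in sentence for word in cleaned):
            yield sentence.rstrip('\n')


# Hypothetical sample data standing in for the two files.
word_file = io.StringIO("cat\ndog\n")
corpus = io.StringIO("I saw a cat.\nNothing here.\nThe dog barked.\n")

matches = list(sentences_containing(word_file, corpus))
for sentence in matches:
    print(sentence)

# A file-like object is exhausted after one full pass, which is why
# nesting the corpus loop inside the word loop only works for the
# first word.
leftover = list(corpus)
```

Note that this is substring matching, so a word such as `cat` would also match a sentence containing `category`. If whole-word matches are needed, a regular expression with word boundaries may be a better fit.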