Question

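I am working on a text search project and use TextBlob to search for sentences in a text. TextBlob efficiently extracts all sentences that contain the keywords. However, for effective research I also want to pull out the sentence before and the sentence after each match, and I cannot figure out how to do that.

Below is the code I am using: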

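from textblob import TextBlob

def extraxt_sents(Text, word):
    # Comma-separated keywords become a set for fast intersection tests
    search_words = set(word.split(','))
    # Lowercase the input so matching is case-insensitive
    sents = ''.join([s.lower() for s in Text])
    blob = TextBlob(sents)
    # Keep every sentence that shares at least one word with the keywords
    matches = [str(s) for s in blob.sentences if search_words & set(s.words)]
    print(search_words)
    print(matches)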

Answer 1

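If you want the sentences before and after a match, you can either write a loop that remembers the previous sentence, or slice the blob.sentences list with [from:to], for example:

match_region = [list(map(str, blob.sentences[i-1:i+2]))  # from previous to next sentence
                for i, s in enumerate(blob.sentences)    # i is index, s is element
                if search_words & set(s.words)]          # same as your condition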

The simplest way is to use Python's built-in slicing:

blob.sentences[i-1:i+2]

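Here, blob.sentences[i-1:i+2] extracts the sublist from index i-1 (inclusive) to index i+2 (exclusive), and map converts the elements of that sublist to strings.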
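To make the slice bounds concrete, here is a small sketch on a plain list of strings standing in for blob.sentences. The sample data is made up for illustration.

```python
# Stand-in for blob.sentences; the sample sentences are illustrative only.
sentences = ["first.", "second.", "third.", "fourth.", "fifth."]

i = 2  # index of the matching sentence ("third.")
window = sentences[i-1:i+2]  # takes indices 1, 2 and 3; index 4 is excluded
print(window)  # ['second.', 'third.', 'fourth.']
```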

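Note: in practice you will probably want to replace i-1 with max(0, i-1); otherwise i-1 can be -1 (when the match is the first sentence), which Python interprets as the last element, producing an empty slice. If i+2 is greater than the length of the list, on the other hand, that is not a problem.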
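The clamping described in the note could be wrapped up roughly as follows. This is only a sketch: the helper name sentences_with_context is hypothetical, it works on a plain list of sentence strings rather than on TextBlob objects, and it uses a simple regular expression as a stand-in for s.words.

```python
import re

def sentences_with_context(sentences, search_words):
    """Return [previous, match, next] windows for each matching sentence."""
    regions = []
    for i, sentence in enumerate(sentences):
        # Rough stand-in for TextBlob's s.words: lowercase word tokens
        words = set(re.findall(r"\w+", sentence.lower()))
        if search_words & words:
            start = max(0, i - 1)  # clamp so a match at index 0 never gives -1
            regions.append(sentences[start:i + 2])  # an end past the list is fine
    return regions

sentences = ["Cats sleep a lot.", "Dogs bark.", "Birds sing.", "Fish swim."]
print(sentences[-1:2])  # [] : what an unclamped i-1 gives for a match at index 0
print(sentences_with_context(sentences, {"cats", "fish"}))
```

With TextBlob, the same helper could presumably be fed [str(s) for s in blob.sentences], keeping in mind that its word matching may differ slightly from s.words.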
