Question

I have a list of strings containing descriptions taken from a body of text, like this:

stringlist = ['I have a dog and cat and the dog is seven years old', 'that dog is old']

I need to filter these strings by a list of keywords stored in another list:

keywords = ['dog', 'cat', 'old']

and append each keyword to a row, depending on where it occurs in the string.

filteredlist = [['dog', 'dog', 'cat', 'old'], ['dog', 'old']]

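I am splitting each string in stringlist and using a list comprehension to check whether the keyword is in the resulting list, but the output is incorrect when I loop over the keywords.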

When I search with one specific string, the code works correctly:

filteritem = 'dog'
filteredlist = []
for string in stringlist:
    string = string.split()
    res = [x for x in string if filteritem in x]
    filteredlist.append(res)

The resulting filtered list looks like this:

filteredlist = [['dog', 'dog'], ['dog']]

The keyword is appended once for each instance of it in the string.

When I try to iterate over the keyword list with a for loop, the output loses its structure.

filteredlist = []
for string in stringlist:
    string = string.split()
    for keyword in keywords:
        res = [x for x in string if keyword in x]
        filteredlist.append(res)

This is the output:

filteredlist =  [['dog', 'dog'], ['cat'], ['old'], [], ['dog'], [], ['old'], []]

I think I am approaching this problem completely wrong, so any other method or solution would be helpful.

Answer 1

You can write this as a nested list comprehension:

>>> [[word for word in string.split() if word in keywords] for string in stringlist]
[['dog', 'cat', 'dog', 'old'], ['dog', 'old']]
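
If an explicit loop is easier to follow, the same idea can be sketched as below. The structure was lost in the question's loop because append ran once per keyword; here it runs once per string. The names keyword_set and matches are my own, and converting keywords to a set is an optional choice for faster lookups, not something the comprehension above requires.

```python
stringlist = ['I have a dog and cat and the dog is seven years old', 'that dog is old']
keywords = ['dog', 'cat', 'old']

# Assumption: a set is used only to speed up membership tests.
keyword_set = set(keywords)

filteredlist = []
for string in stringlist:
    # Start a fresh inner list for every string so the nesting is preserved.
    matches = []
    for word in string.split():
        # Exact word match, keeping the order in which words appear.
        if word in keyword_set:
            matches.append(word)
    # Append once per string, not once per keyword.
    filteredlist.append(matches)

print(filteredlist)
# [['dog', 'cat', 'dog', 'old'], ['dog', 'old']]
```

Note that this compares whole words, so 'dogs' would not match 'dog', whereas the question's `keyword in x` test is a substring check and would match it.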
