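I have a list of sentences, and I want to find every sentence that does not contain at least one word from a second list of words. I tried a list comprehension: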
[sentence for sentence in sentences if word_list is not in sentence]
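This does not work, because what I need to ask is whether each word in the word list is absent from the sentence, not whether the list itself is.
The main thing I need is to identify all sentences in which no word matches a word in the word list. I am looking for ASR errors: I have a word list, and every sentence must contain at least one word from it, otherwise the sentence has an ASR error.
I could work out how to do this with grep -v and pipe the calls together, but I want to do it in Python.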
Answer 0 (score: 2)
I think you mean
[sentence for sentence in sentences if all(word not in sentence for word in word_list)]
As more general guidance: if the logic is more complicated than you can hold in your head at once, don't use a comprehension.
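Following that guidance, the same check can be written as an explicit loop. This is only a minimal sketch: the function name `sentences_without_words` is hypothetical, and it keeps the substring test (`word in sentence`) used in the comprehension above.

```python
def sentences_without_words(sentences, word_list):
    """Return the sentences that contain none of the words in word_list.

    Hypothetical helper; uses the same substring test as the comprehension.
    """
    result = []
    for sentence in sentences:
        found = False
        for word in word_list:
            if word in sentence:  # substring test on the whole sentence
                found = True
                break  # one match is enough, stop checking this sentence
        if not found:
            result.append(sentence)
    return result


flagged = sentences_without_words(
    ['I went to USA from JAPAN', 'there was no mail'],
    ['USA', 'JAPAN', 'RUSSIA'],
)
```

The loop is longer, but each step (scan the words, stop on the first hit, keep the sentence only if nothing matched) can be read and debugged on its own.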
Answer 1 (score: 2)
If you want to identify the sentences that do not contain any word from word_list, use the following one-liner:
In [1]: word_list = ['USA', 'JAPAN', 'RUSSIA']
In [2]: sentences = ['I went to USA from JAPAN', 'there was no mail', 'I really dont believe RUSSIA did it']
In [3]: [sentence for sentence in sentences if not any(word in sentence for word in word_list)]
Out[3]: ['there was no mail']
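One caveat with `word in sentence` on strings: it is a substring test, so 'USA' would also be found inside 'USAGE'. If whole-word matching is what you need, a possible sketch uses a single compiled regular expression with `\b` word boundaries, which is also close in spirit to grep -v. The third sample sentence is made up for illustration, and the sketch assumes word_list is non-empty.

```python
import re

word_list = ['USA', 'JAPAN', 'RUSSIA']
sentences = ['I went to USA from JAPAN', 'there was no mail', 'USAGE was high']

# One alternation of all words; re.escape guards against regex metacharacters.
# Assumes word_list is non-empty (an empty alternation would match anywhere).
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, word_list)) + r')\b')

# Keep only the sentences in which no listed word occurs as a whole word.
no_match = [s for s in sentences if not pattern.search(s)]
```

With the plain substring test, 'USAGE was high' would be treated as containing 'USA' and dropped from the result; with word boundaries it is kept.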
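Answer 2 (score: 0)
You can do this in roughly O(n^2) time, since each word of each sentence is checked against word_list.
no_match = [sentence for sentence in sentences
            if not [word for word in sentence.split() if word in word_list]]
This is equivalent to:
no_match = []
for sentence in sentences:
    words = []
    # split() yields words; iterating the string directly would yield characters
    for word in sentence.split():
        if word in word_list:
            words.append(word)
    # keep the sentence only if none of its words were in word_list
    if not words:
        no_match.append(sentence)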
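If the word list is large, the repeated `word in word_list` scans dominate the cost. A minimal sketch of one way to reduce that, assuming whitespace-separated words are good enough for your data: convert the list to a set once and use `set.isdisjoint`. The function name `sentences_missing_all` is hypothetical.

```python
def sentences_missing_all(sentences, word_list):
    """Return sentences that share no whitespace-separated word with word_list."""
    word_set = set(word_list)  # built once; membership tests are O(1) on average
    # isdisjoint accepts any iterable and stops at the first common element
    return [s for s in sentences if word_set.isdisjoint(s.split())]


no_match = sentences_missing_all(
    ['I went to USA from JAPAN', 'there was no mail'],
    ['USA', 'JAPAN', 'RUSSIA'],
)
```

Note that `split()` does not strip punctuation, so 'USA.' would not match 'USA'; normalize the text first if that matters for your transcripts.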