Suppose I have this list:
sentences = ['the cat slept', 'the dog jumped', 'the bird flew']
I want to filter out any sentence that contains a word from the following list:
terms = ['clock', 'dog']
I should get:
['the cat slept', 'the bird flew']
I tried this solution, but it doesn't work:
empty = []
if any(x not in terms for x in sentences):
    empty.append(x)
What is the best way to solve this?
Answer 0 (score: 0)
For readability, I would go with a solution like this rather than collapsing it into a one-liner:
empty = []
for sentence in sentences:
    # Keep the sentence only if none of the terms occurs in it.
    if all(term not in sentence for term in terms):
        empty.append(sentence)
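Answer 1 (score: 0)
A simple brute-force O(m * n) approach using a list comprehension:
For each sentence, check whether any of the disallowed terms occurs in it, and keep the sentence if nothing matches.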
[s for s in sentences if not any(t in s for t in terms)]
# ['the cat slept', 'the bird flew']
Obviously, you can also invert the condition:
[s for s in sentences if all(t not in s for t in terms)]
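One thing worth knowing about both comprehensions: `t in s` on strings is a substring test, so a term like 'dog' also excludes a sentence containing 'dogs'. Here is a minimal sketch of that behavior, assuming a hypothetical helper name `filter_sentences` and an extra sample sentence that is not in the question:

```python
def filter_sentences(sentences, terms):
    """Keep sentences that contain none of the terms as a substring."""
    return [s for s in sentences if not any(t in s for t in terms)]

# 'the dogs ran' is an added sample, used only to show substring matching.
sentences = ['the cat slept', 'the dog jumped', 'the bird flew', 'the dogs ran']
terms = ['clock', 'dog']

result = filter_sentences(sentences, terms)
print(result)  # ['the cat slept', 'the bird flew']
```

If that is what you want, the comprehension is enough. If only whole words should count, the words of each sentence need to be compared instead.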
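Answer 2 (score: 0)
Similar to the two answers above, but using filter and comparing whole words, which may be closer to what the question asks for: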
list(filter(lambda s: all(word not in terms for word in s.split()), sentences))
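Splitting on spaces treats 'dog,' and 'dog' as different words. If punctuation matters, one possible variation is a regular expression with word boundaries. This is only a sketch: the helper name `filter_whole_words` and the sample sentences are my own, and `\b` assumes the terms start and end with word characters.

```python
import re

def filter_whole_words(sentences, terms):
    """Keep sentences in which no term appears as a whole word."""
    if not terms:
        return list(sentences)
    # One alternation of all terms, escaped and wrapped in word boundaries.
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(t) for t in terms) + r')\b')
    return [s for s in sentences if not pattern.search(s)]

# Sample data chosen to show punctuation and partial-word cases.
sentences = ['the cat slept', 'the dog, jumped', 'the dogs ran']
terms = ['clock', 'dog']

print(filter_whole_words(sentences, terms))  # ['the cat slept', 'the dogs ran']
```

Here 'the dog, jumped' is removed even though the word is followed by a comma, while 'the dogs ran' is kept because 'dog' is only part of a longer word.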
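Answer 3 (score: 0)
Binary search is better optimized for very long sentences and long term lists. Note that this version matches whole words, not substrings.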
from bisect import bisect

def binary_search(a, x, lo=0, hi=None):
    # Return the index of x in the sorted list a, or -1 if x is absent.
    i = bisect(a, x, lo, len(a) if hi is None else hi)
    if i > 0 and a[i - 1] == x:
        return i - 1
    return -1

sentences = ['the cat slept', 'the dog jumped', 'the bird flew', 'the a']
terms = ['clock', 'dog']

# Pair each sentence with its sorted word list so it can be binary searched.
sentences_with_sorted = [(sentence, sorted(sentence.split()))
                         for sentence in sentences]

valid_sentences = []
for sentence, sorted_words in sentences_with_sorted:
    # Keep the sentence only if none of the terms is among its words.
    if all(binary_search(sorted_words, term) < 0 for term in terms):
        valid_sentences.append(sentence)

print(valid_sentences)
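As an alternative to sorting every sentence, the same whole-word check can be sketched with a set, which swaps binary search for hash lookups. This is not the answer's method, just a possible variation; the helper name `filter_by_word_set` is hypothetical.

```python
def filter_by_word_set(sentences, terms):
    """Keep sentences that share no whole word with terms."""
    banned = set(terms)  # built once, membership tests are O(1) on average
    # isdisjoint accepts any iterable, so the word list needs no conversion.
    return [s for s in sentences if banned.isdisjoint(s.split())]

sentences = ['the cat slept', 'the dog jumped', 'the bird flew', 'the a']
terms = ['clock', 'dog']

print(filter_by_word_set(sentences, terms))
# ['the cat slept', 'the bird flew', 'the a']
```

This avoids the O(k log k) sort per sentence and only walks each sentence's words once.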