For example:
item =['the dog is gone', 'the dog and cat is gone']
words= ['dog','cat']
I would like to filter out dog and cat so that it reads:
item=['the is gone', 'the and is gone']
item1=[]
for w in words:
    for line in item:
        if w in line:
            j=gg.replace(it,'')
            item1.append(j)
I get the following:
['the is gone', 'the cat and is gone', 'the and dog is gone']
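Answer 0 (score: 5)
You are looping over all the lines for each word and appending a replacement each time, so a line that contains several of the words is appended more than once, each time with only one word removed. You should switch the loops: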
item1 = []
for line in item:
    for w in words:
        line = line.replace(w, '')
    item1.append(line)
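Note: I altered some of the code. I renamed the variable gg to line and the variable it to item, and I removed the check for whether line contains w, since replace already handles that case.
replace doesn't know about word boundaries. If you only want to remove whole words, you should try a different approach, using re.sub:
import re
item1 = []
for line in item:
    for w in words:
        line = re.sub(r'\b%s\b' % w, '', line)  # '\b' is a word boundary
    item1.append(line)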
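As a rough sketch building on the re.sub idea, the words can be combined into a single pattern so each line is scanned once. The helper name remove_words is hypothetical, and the use of re.escape and the whitespace clean-up step are additions not found in the answer above; the clean-up assumes you want the leftover double spaces collapsed.

```python
import re

item = ['the dog is gone', 'the dog and cat is gone']
words = ['dog', 'cat']

# One alternation that matches any listed word as a whole word.
# re.escape (an addition here) guards against regex metacharacters in the words.
pattern = re.compile(r'\b(?:%s)\b' % '|'.join(re.escape(w) for w in words))

def remove_words(line):
    # remove_words is a hypothetical helper name, not part of the original answer.
    stripped = pattern.sub('', line)
    # Removing a word leaves two adjacent spaces; split/join collapses them.
    return ' '.join(stripped.split())

item1 = [remove_words(line) for line in item]
print(item1)  # ['the is gone', 'the and is gone']
```

Because of the word boundaries, a string such as 'hotdog' is left alone, whereas plain str.replace would turn it into 'hot'.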
Answer 1 (score: 2)
You can use this approach instead:
item =['the dog is gone', 'the dog and cat is gone']
words= ['dog','cat']
item2 = [" ".join([w for w in t.split() if w not in words]) for t in item]
print(item2)
# ['the is gone', 'the and is gone']
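A small variation on the split-and-join approach, offered only as a sketch: wrapping it in a function (filter_words is a hypothetical name) and converting the word list to a set, which keeps membership tests cheap when the list is long. It also shows one limitation worth knowing: tokens are compared exactly, so a word with attached punctuation is not removed.

```python
def filter_words(lines, words):
    # filter_words is a hypothetical helper name used for illustration.
    blocked = set(words)  # set membership is O(1) on average
    return [
        ' '.join(tok for tok in line.split() if tok not in blocked)
        for line in lines
    ]

print(filter_words(['the dog is gone', 'the dog and cat is gone'], ['dog', 'cat']))
# ['the is gone', 'the and is gone']

# Tokens are matched exactly, so "dog," (with a comma) is kept.
print(filter_words(['the dog, gone'], ['dog']))
# ['the dog, gone']
```

If punctuation matters for your data, a word-boundary regular expression is likely the better fit.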