How do I filter out words in Python?

Time: 2012-12-01 06:05:59

Tags: python string sorting

For example:

item =['the dog is gone', 'the dog and cat is gone']
words= ['dog','cat'] 

I want to be able to filter out dog and cat, so that it reads:

item=['the  is gone', 'the   and  is gone']

item1=[] 
for w in words:
   for line in item:
      if w in line:
         j=gg.replace(it,'')
         item1.append(j)

I get the following:

['the  is gone', 'the cat and  is gone', 'the  and dog is gone']

2 answers:

Answer 0 (score: 5)

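You are looping over all the lines for each word and appending a replacement every time a word matches, so a line that contains both words is appended twice, each time with only one word removed. You should swap the loops: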

item1 = [] 
for line in item:
    for w in words:
        line = line.replace(w, '')
    item1.append(line)

Note: I changed some of the code:

  • changed gg to line
  • changed it to w
  • removed the check for whether line contains w, since replace already handles that case

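Note that replace knows nothing about word boundaries. If you only want to remove whole words, you should try a different approach, such as re.sub:

import re

item1 = []
for line in item:
    for w in words:
        line = re.sub(r'\b%s\b' % w, '', line)  # '\b' is a word boundary
    item1.append(line)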
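The whole-word removal with re.sub can also be done in one pass per line. The following is only a sketch under a few assumptions: the names pattern and remove_words are made up for illustration, and re.escape is added on the assumption that the words might contain regex metacharacters.

```python
import re

item = ['the dog is gone', 'the dog and cat is gone']
words = ['dog', 'cat']

# One alternation pattern for all words; re.escape protects any regex
# metacharacters, and \b restricts matches to whole words.
pattern = re.compile(r'\b(?:%s)\b' % '|'.join(map(re.escape, words)))

def remove_words(line):
    # Hypothetical helper: delete every whole-word match from the line.
    return pattern.sub('', line)

item1 = [remove_words(line) for line in item]
print(item1)
```

Because of the word boundaries, a string such as 'dogma' is left untouched, whereas plain str.replace would turn it into 'ma'. The spaces around each removed word are kept, as in the output requested in the question.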

Answer 1 (score: 2)

You can use this approach instead:

item =['the dog is gone', 'the dog and cat is gone']
words= ['dog','cat'] 

item2 = [" ".join([w for w in t.split() if w not in words]) for t in item]

print(item2)

>>> ['the is gone', 'the and is gone']
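One thing to be aware of with the split/join approach is that it compares whole whitespace-separated tokens, so 'dog,' (with a trailing comma) would not match 'dog'. A possible variant, assuming you want to ignore surrounding punctuation and letter case (neither is asked for in the question, and drop_words is a hypothetical helper name):

```python
import string

item = ['the dog, the cat and the bird', 'the Dog and cat is gone']
words = {'dog', 'cat'}  # a set gives fast membership tests

def drop_words(text, banned):
    # Keep a token unless its punctuation-stripped, lowercased form is banned.
    kept = [token for token in text.split()
            if token.strip(string.punctuation).lower() not in banned]
    return ' '.join(kept)

item2 = [drop_words(t, words) for t in item]
print(item2)
```

A dropped token takes its attached punctuation with it, and runs of whitespace are collapsed to single spaces by the split/join round trip.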