从字符串中删除多个单词的更好方法是什么?

时间:2015-07-07 15:56:57

标签: python regex string python-3.x replace

bannedWord = ['Good','Bad','Ugly']

def RemoveBannedWords(toPrint,database):
    statement = toPrint
    for x in range(0,len(database)):
        if bannedWord[x] in statement:
            statement = statement.replace(bannedWord[x]+' ','')
    return statement

toPrint = 'Hello Ugly Guy, Good To See You.'

print RemoveBannedWords(toPrint,bannedWord)

输出为Hello Guy, To See You.了解Python我觉得有更好的方法来实现更改字符串中的多个单词。我使用字典搜索了一些类似的解决方案,但它似乎不适合这种情况。

5 个答案:

答案 0 :(得分:10)

我用

bannedWord = ['Good','Bad','Ugly']
toPrint = 'Hello Ugly Guy, Good To See You.'
print ' '.join(i for i in toPrint.split() if i not in bannedWord)

答案 1 :(得分:6)

这是一个使用正则表达式的解决方案:

import re

def RemoveBannedWords(toPrint,database):
    statement = toPrint
    pattern = re.compile("\\b(Good|Bad|Ugly)\\W", re.I)
    return pattern.sub("", toPrint)

toPrint = 'Hello Ugly Guy, Good To See You.'

print RemoveBannedWords(toPrint,bannedWord)

答案 2 :(得分:0)

主题的又一个变体。如果你要打电话给这个,那么最好先编译一次正则表达式:

import re

bannedWord = ['Good','Bad','Ugly']
re_banned_words = re.compile(r"\b(" + "|".join(bannedWord) + ")\\W", re.I)

def RemoveBannedWords(toPrint):
    global re_banned_words
    return re_banned_words.sub("", toPrint)

toPrint = 'Hello Ugly Guy, Good To See You.'
print RemoveBannedWords(toPrint)

答案 3 :(得分:0)

Ajay的代码略有变化,当其中一个字符串是bannedWord列表中其他字符串的子字符串时

bannedWord = ['good', 'bad', 'good guy' 'ugly']

toPrint ='good winter good guy'的结果将是

RemoveBannedWords(toPrint,database = bannedWord) = 'winter good'

因为它会首先删除good。需要对列表中的元素长度进行排序。

import re

def RemoveBannedWords(toPrint,database):
    statement = toPrint
    database_1 = sorted(list(database), key=len)
    pattern = re.compile(r"\b(" + "|".join(database_1) + ")\\W", re.I)
    return pattern.sub("", toPrint + ' ')[:-1] #added because it skipped last word

toPrint = 'good winter good guy.'

print(RemoveBannedWords(toPrint,bannedWord))

答案 4 :(得分:0)

当您检查开头的单词边界和结尾的非单词字符时,最好使用正则表达式。 仍然可以使用内存中的数组/列表

bannedWord = ['Good', 'Bad', 'Ugly']

toPrint = 'Hello Uglyyy Guy, Good To See You.'

for word in bannedWord:
    toPrint = toPrint.replace(word, "")

print(toPrint) 
Hello yy Guy,  To See You.

[Program finished]