Question

I have a text in which I want to replace certain words and phrases.

One sentence is: We lived there in the late 1990s.

I search it to find ate. (= words[0])

newline = re.sub('ate', newselectionString, line)

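But I only want it to find ate on its own, not as part of another word.

Is it possible to tell re to find only these 3 letters?

Later in the text there is: The best thing was when we ate ice cream.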

for line in lines:
    for i in range(0, len(words)):
        if words[i] in line:
            print('Found ' + words[i])
            newselectionString = selectionString.replace('GX', 'G' + str(startInt))
            newline = re.sub(words[i], newselectionString, line)
            newLines.append(newline)
            startInt += 1

Answer 1

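There are two ways to do this:

Regular expression

The regular expression you need is \bate\b, that is, ate must sit between two word boundaries. It matches "We ate." and "I ate it.", but not "We're late.".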
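
A minimal sketch of that pattern in use, run against the three sample sentences above. The replacement text "G1" is only a placeholder assumed for illustration; it does not come from the question.

```python
import re

# Sample sentences taken from the answer text above.
samples = ["We ate.", "I ate it.", "We're late."]

# Raw string, so that \b reaches the regex engine as a word boundary
# instead of being read by Python as a backspace character.
pattern = re.compile(r"\bate\b")

# Sentences in which "ate" occurs as a whole word.
matched = [s for s in samples if pattern.search(s)]

# "G1" is a placeholder replacement (an assumption for this sketch).
replaced = [pattern.sub("G1", s) for s in samples]

print(matched)
print(replaced)
```

The "ate" inside "late" is left alone because the letter "l" directly before it is a word character, so there is no word boundary at that position.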

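Splitting the string

This is very similar to the plain regular expression, but it is useful if you want control over the other words in the sentence.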

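import re

# Raw string needed: a plain "\b" is a backspace. Splitting on \b requires Python 3.7+.
word_fragments = re.split(r"\b", your_string)

print(''.join([word for word in word_fragments if word != 'ate']))  # fragments keep their own spaces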
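
To make the splitting approach easier to follow, here is a small sketch that shows the fragments it produces. The helper name remove_word and the sample sentence are made up for illustration, and splitting on an empty match such as \b assumes Python 3.7 or later.

```python
import re

def remove_word(text, word):
    """Drop every standalone occurrence of `word` from `text` (hypothetical helper)."""
    # Splitting on \b yields words, spaces and punctuation as separate fragments.
    fragments = re.split(r"\b", text)
    # Exact comparison: the fragment "late" is not equal to "ate", so it is kept.
    kept = [frag for frag in fragments if frag != word]
    # The fragments already contain the original spacing, so join on "".
    return "".join(kept)

# Made-up sample sentence containing both a whole-word and an embedded "ate".
fragments = re.split(r"\b", "We ate late.")
cleaned = remove_word("We ate late.", "ate")

print(fragments)
print(cleaned)
```

Note that removing a word this way leaves the spaces on both sides of it behind, so the result may need its whitespace collapsed afterwards.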

Answer 2

Use the word boundary \b together with str.format.

For example:

re.sub(r"\b{}\b".format(words[i]), "Hello World", Text)
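
As a sketch of how this could slot into the loop from the question: the function name replace_whole_words, the "[GX]" template and the sample lines are assumptions for illustration. It also passes each word through re.escape, which is not in the answer above but keeps characters such as "." in a search word from being read as regex syntax.

```python
import re

def replace_whole_words(lines, words, selection_string, start_int=1):
    """Replace whole-word matches, numbering each replacement (hypothetical helper)."""
    new_lines = []
    for line in lines:
        for word in words:
            # re.escape makes the word literal; \b on both sides requires a whole word.
            pattern = r"\b{}\b".format(re.escape(word))
            if re.search(pattern, line):
                # Same numbering scheme as the question: GX becomes G1, G2, ...
                replacement = selection_string.replace("GX", "G" + str(start_int))
                new_lines.append(re.sub(pattern, replacement, line))
                start_int += 1
    return new_lines

# Made-up input; only lines with a match are collected, as in the question's loop.
result = replace_whole_words(["We ate.", "We're late."], ["ate"], "[GX]")

# Without escaping, the "." in "a.e" would match any character, including "t".
escaped_hit = re.search(r"\b{}\b".format(re.escape("a.e")), "We ate.")

print(result)
print(escaped_hit)
```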
