Replace all occurrences of specific words

Time: 2014-09-02 20:19:44

Tags: python regex python-2.7

Suppose I have the following sentence:

bean likes to sell his beans

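I want to replace all occurrences of specific words with other words. For example, bean with robert and beans with cars.

I can't just use str.replace, because in this case it would change beans to roberts: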

>>> "bean likes to sell his beans".replace("bean","robert")
'robert likes to sell his roberts'

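I only need to change whole words, not occurrences of the word inside another word. I think I can achieve this with regular expressions, but I don't know how to do it properly.

4 answers:

Answer 0: (score: 16)

If you use a regular expression, you can specify word boundaries with \b: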

import re

sentence = 'bean likes to sell his beans'

sentence = re.sub(r'\bbean\b', 'robert', sentence)
# 'robert likes to sell his beans'

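Here 'beans' is not changed (to 'roberts') because the 's' at the end means there is no word boundary right after 'bean': \b matches the empty string, but only at the beginning or end of a word.

The second replacement, for completeness: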

sentence = re.sub(r'\bbeans\b', 'cars', sentence)
# 'robert likes to sell his cars'
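To see where \b does and does not match, here is a minimal sketch. The helper name replace_word and the second sample sentence are made up for illustration. It also passes the word through re.escape, on the assumption that the word to replace might contain regex metacharacters.

```python
import re

def replace_word(text, old, new):
    # Hypothetical helper: replace `old` only where it stands as a whole word.
    # re.escape guards against regex metacharacters in `old`.
    return re.sub(r'\b' + re.escape(old) + r'\b', new, text)

print(replace_word('bean likes to sell his beans', 'bean', 'robert'))
# robert likes to sell his beans

# Punctuation counts as a boundary; a following letter does not.
print(replace_word('bean, beans and beanbag; a bean.', 'bean', 'robert'))
# robert, beans and beanbag; a robert.
```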

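Answer 1: (score: 4)

If you replace one word at a time, you may end up replacing a word several times (and not get what you want). To avoid this, you can use a function or lambda: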

import re

d = {'bean': 'robert', 'beans': 'cars'}
str_in = 'bean likes to sell his beans'
# Look up every whole word in d; words that are not keys are left unchanged
str_out = re.sub(r'\b(\w+)\b', lambda m: d.get(m.group(1), m.group(1)), str_in)

This way, once bean has been replaced by robert, it will not be modified again (even if robert is also in your list of input words).
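The difference shows up when a replacement is itself one of the keys. The sketch below uses a made-up mapping (robert to alice is not from the question) to compare word-by-word substitution with the single-pass lookup:

```python
import re

# Hypothetical mapping in which one replacement ('robert') is also a key.
d = {'bean': 'robert', 'robert': 'alice'}
str_in = 'bean met robert'

# One word at a time: 'bean' becomes 'robert', which is then replaced again.
sequential = str_in
for old, new in [('bean', 'robert'), ('robert', 'alice')]:
    sequential = re.sub(r'\b' + re.escape(old) + r'\b', new, sequential)

# Single pass: each word of the input is looked up exactly once.
single_pass = re.sub(r'\b(\w+)\b', lambda m: d.get(m.group(1), m.group(1)), str_in)

print(sequential)   # alice met alice
print(single_pass)  # robert met alice
```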

Following georg's suggestion, I edited this answer to use dict.get(key, default_value). An alternative solution (also suggested by georg):

str_out = re.sub(r'\b(%s)\b' % '|'.join(d.keys()), lambda m:d.get(m.group(1), m.group(1)), str_in)
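The alternation above interpolates the keys into the pattern as they are. If a key might contain regex metacharacters, escaping each key first is safer. A rough sketch, assuming a made-up extra key 'v1.0' and a hypothetical helper named build_pattern:

```python
import re

# 'v1.0' is an illustrative key, added to show why escaping matters.
d = {'bean': 'robert', 'beans': 'cars', 'v1.0': 'v2.0'}

def build_pattern(words):
    # Hypothetical helper: escape each key and join them into one alternation.
    # Sorting longest first just makes the pattern deterministic.
    escaped = [re.escape(w) for w in sorted(words, key=len, reverse=True)]
    return re.compile(r'\b(%s)\b' % '|'.join(escaped))

pattern = build_pattern(d)
# Every match is one of the keys, so a plain lookup is enough here.
str_out = pattern.sub(lambda m: d[m.group(1)], 'bean sells beans, v1.0 not v1x0')
print(str_out)  # robert sells cars, v2.0 not v1x0
```

Without re.escape, the '.' in 'v1.0' would match any character, so 'v1x0' would be matched as well.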

Answer 2: (score: -1)

"bean likes to sell his beans".replace("beans", "cars").replace("bean", "robert")

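This replaces all instances of "beans" with "cars" and then all instances of "bean" with "robert". It works because .replace() returns a modified copy of the original string, so you can think of it in stages. It basically works like this: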

 >>> first_string = "bean likes to sell his beans"
 >>> second_string = first_string.replace("beans", "cars")
 >>> third_string = second_string.replace("bean", "robert")
 >>> print(first_string, second_string, third_string)

 ('bean likes to sell his beans', 'bean likes to sell his cars', 
  'robert likes to sell his cars')

Answer 3: (score: -1)

I know it's been a long time, but doesn't this look more elegant?

import re
from functools import reduce  # needed on Python 3

reduce(lambda x, y: re.sub(r'\b(' + y[0] + r')\b', y[1], x),
       [("bean", "robert"), ("beans", "cars")], "bean likes to sell his beans")