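I know this question has been answered before in "Case insensitive replace", but mine is a bit different.
What I want is to search for certain keywords in a text and wrap each match in <b> and </b>. The following example explains four different possibilities: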
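Keywords = ['hell', 'world']
Input Sentence = 'Hell is a wonderful place to say hello and sell shells'
Expected Output 1 = '<b>Hell</b> is a wonderful place to say hello and sell shells'
- (The output uses the found word 'Hell', not the keyword 'hell'. Only full-word matches are replaced.)
Expected Output 2 = '<b>Hell</b> is a wonderful place to say <b>hello</b> and sell shells'
- (Only words that start with the keyword are replaced. Note that the whole word is wrapped even if the match is partial.)
Expected Output 3 = '<b>Hell</b> is a wonderful place to say <b>hello</b> and sell <b>shells</b>'
- (Any occurrence of hell is replaced, and the complete matching word is wrapped.)
Expected Output 4 = '<b>Hell</b> is a wonderful place to say <b>hell</b>o and sell s<b>hell</b>s'
- (Any occurrence of hell is replaced, but only the matched part is wrapped, not the complete word. The rest of the word is left as it is.)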
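The linked SO question replaces the found word with the keyword itself, which is not what I want. I want to keep the casing of the input sentence intact. Can someone help me find solutions for the four cases above?
The code I tried: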
import re
insensitive_hippo = re.compile(re.escape('hell'), re.IGNORECASE)
insensitive_hippo.sub('hell', 'Hell is a wonderful place to say hello and sell shells')
'hell is a wonderful place to say hello and sell shells'
But this does not keep the found word intact.
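The attempt above loses the original casing because the replacement string is the literal keyword 'hell'. As a minimal sketch of one possible fix (the names `pattern`, `sentence` and `result` are only illustrative), the replacement can refer back to the matched text with `\g<0>` instead:

```python
import re

sentence = 'Hell is a wonderful place to say hello and sell shells'

# Case-insensitive pattern; re.escape guards against regex metacharacters in the keyword.
pattern = re.compile(re.escape('hell'), re.IGNORECASE)

# \g<0> inserts whatever text was actually matched, so its casing is preserved.
result = pattern.sub(r'<b>\g<0></b>', sentence)
print(result)
# <b>Hell</b> is a wonderful place to say <b>hell</b>o and sell s<b>hell</b>s
```

This corresponds to Expected Output 4; the other three cases need word boundaries or extra word characters in the pattern.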
Answer 0 (score: 2)
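import re
x = 'Hell is a wonderful place to say hello and sell shells'
print(re.sub(r"\b(hell)\b", r"<b>\1</b>", x, flags=re.I))        # case 1: whole word only
print(re.sub(r"\b(hell\S*)", r"<b>\1</b>", x, flags=re.I))       # case 2: words starting with the keyword
print(re.sub(r"\b(\S*hell\S*)", r"<b>\1</b>", x, flags=re.I))    # case 3: words containing the keyword
print(re.sub(r"(hell)", r"<b>\1</b>", x, flags=re.I))            # case 4: only the matched part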
Output:
<b>Hell</b> is a wonderful place to say hello and sell shells
<b>Hell</b> is a wonderful place to say <b>hello</b> and sell shells
<b>Hell</b> is a wonderful place to say <b>hello</b> and sell <b>shells</b>
<b>Hell</b> is a wonderful place to say <b>hell</b>o and sell s<b>hell</b>s
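The patterns above handle the single keyword 'hell', while the question lists several keywords. One way to extend them, sketched here under some assumptions, is to join the escaped keywords into one alternation. The function name `highlight` and the mode names are hypothetical, and `\w*` is used in place of `\S*` so that trailing punctuation stays outside the tags; that is a design choice of this sketch, not part of the answer.

```python
import re

def highlight(text, keywords, mode):
    """Wrap keyword matches in <b>...</b>, keeping the original casing."""
    # Longer keywords first, so overlapping keywords prefer the longer match.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    templates = {
        "whole_word": r"\b(?:{})\b",      # case 1: the keyword as a complete word
        "prefix": r"\b(?:{})\w*",         # case 2: words starting with a keyword
        "containing": r"\w*(?:{})\w*",    # case 3: words containing a keyword
        "substring": r"(?:{})",           # case 4: only the matched part
    }
    regex = re.compile(templates[mode].format(alternation), re.IGNORECASE)
    # A function replacement avoids any backslash handling in the replacement text.
    return regex.sub(lambda m: "<b>" + m.group(0) + "</b>", text)

sentence = 'Hell is a wonderful place to say hello and sell shells'
for mode in ("whole_word", "prefix", "containing", "substring"):
    print(highlight(sentence, ['hell', 'world'], mode))
```

With the input sentence from the question, the four modes print the same four lines as the output above; a call such as `highlight('Hello World', ['hell', 'world'], 'prefix')` returns `'<b>Hello</b> <b>World</b>'`.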