Question

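Say I have a text string whose characters are all Latin-based, plus punctuation.

How can I "find" every occurrence of a substring and wrap <strong> tags around it?

hay = "The fox jumped up the tree."
needle = "umpe"

In this case, part of the word "jumped" would be highlighted.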

Answer 1

Without regular expressions (possibly more verbose, but also easier to understand):

hay = "The fox jumped up the tree."
needle = "umpe"

print(hay.replace(needle, "<strong>%s</strong>" % needle))

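Edit after the additional specification: if you want a case-insensitive replacement (which plain string replace cannot do):

import re

hay = "The fox jUMPed up the tree."
needle = "umpe"

# re.escape keeps any regex metacharacters in the needle literal
regex = re.compile('(%s)' % re.escape(needle), re.I)
# \1 re-inserts the matched text, so its original casing is preserved
print(regex.sub(r'<strong>\1</strong>', hay))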
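The case-insensitive approach above can be wrapped in a small reusable function. This is only a sketch: the name `highlight` and its `tag` parameter are made up for illustration and do not come from the answer.

```python
import re

def highlight(hay, needle, tag="strong"):
    """Wrap every case-insensitive occurrence of needle in <tag>...</tag>."""
    if not needle:
        # An empty needle would match between every character; leave the text alone.
        return hay
    pattern = re.compile(re.escape(needle), re.IGNORECASE)
    # A replacement function returns its result verbatim, so the matched text
    # keeps its original casing and is never parsed as a replacement template.
    return pattern.sub(lambda m: "<%s>%s</%s>" % (tag, m.group(0), tag), hay)

print(highlight("The fox jUMPed up the tree.", "umpe"))
# The fox j<strong>UMPe</strong>d up the tree.
```

Using a function instead of a template string is a design choice here: it avoids surprises if the text being wrapped ever contains backslashes.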

Answer 2

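Using regular expressions for a simple search like this is overkill. But if you need more complex searches, I consulted Python's re module documentation to put together the code below, which I think does what you want: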

#!/usr/bin/env python
import re
haystack = "The fox jumped up the tree."
needle = "umpe"
new_text = "<strong>" + needle + "</strong>"
new_haystack = re.sub(needle, new_text, haystack)
print(new_haystack)
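One caveat with passing the needle straight to re.sub: any regex metacharacter in it is interpreted as a pattern. The sketch below uses a made-up needle, "e.", to show the difference that re.escape makes; it is an illustration, not part of the original answer.

```python
import re

haystack = "The fox jumped up the tree."
needle = "e."  # hypothetical needle; the dot is a regex metacharacter

new_text = "<strong>" + needle + "</strong>"

# Unescaped: "." matches any character, so "e ", "ed" and "ee" match as well,
# and each of them is overwritten with the literal needle text.
unescaped = re.sub(needle, new_text, haystack)

# Escaped: only the literal two characters "e." match.
escaped = re.sub(re.escape(needle), new_text, haystack)

print(escaped)  # The fox jumped up the tre<strong>e.</strong>
```

Note that new_text is still used as a replacement template, so a needle containing backslashes would need a replacement function or further escaping.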

Answer 3

Your question isn't very clear. If you want to highlight the whole word that contains the needle, you can match

\b(\w*needle\w*)\b

and replace it with

<strong>\1</strong>
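In Python, the whole-word pattern above could be applied roughly as follows. This is a sketch that assumes the needle should be treated as literal text, hence the re.escape call; the variable names are illustrative.

```python
import re

hay = "The fox jumped up the tree."
needle = "umpe"

# \w* on both sides extends the match to the rest of the word;
# \b anchors it at the word boundaries.
word_pattern = re.compile(r"\b(\w*%s\w*)\b" % re.escape(needle))
result = word_pattern.sub(r"<strong>\1</strong>", hay)

print(result)  # The fox <strong>jumped</strong> up the tree.
```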

Answer 4

This does not use regular expressions, but it works fine for smaller strings.

hay = "The fox jumped up the tree."
needle = "umpe"

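hay_lower = hay.lower()
needle_lower = needle.lower()

# Collect the start index of every case-insensitive match.
found = []
curr_find = hay_lower.find(needle_lower)
while curr_find != -1:  # find() returns -1 when there are no more matches
    found.append(curr_find)
    curr_find = hay_lower.find(needle_lower, curr_find + len(needle))

# Replace from right to left so the earlier indices stay valid.
hay_list = list(hay)
for found_index in reversed(found):
    end = found_index + len(needle)
    # Slice from hay so the original casing of the match is kept.
    hay_list[found_index:end] = '<strong>%s</strong>' % hay[found_index:end]

result = ''.join(hay_list)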

Answer 5

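This should work:

import re

hay = "The fox jumped up the tree."
pattern = r'(?P<needle>(umpe))'
pat_obj = re.compile(pattern)
new_text = pat_obj.sub(r'<strong>\g<needle></strong>', hay)

The result, rendered as HTML: The fox j<strong>umpe</strong>d up the tree.

In the snippet above I used the re method 'sub' and referenced a captured group (which I named 'needle').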
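For comparison, a named group can also be referenced by its number. The sketch below assumes the same hay string as the question and shows three replacement templates that give the same result; the variable names are illustrative.

```python
import re

hay = "The fox jumped up the tree."
pat_obj = re.compile(r"(?P<needle>umpe)")

# The named group "needle" is also group number 1.
by_name = pat_obj.sub(r"<strong>\g<needle></strong>", hay)
by_number = pat_obj.sub(r"<strong>\g<1></strong>", hay)
by_short = pat_obj.sub(r"<strong>\1</strong>", hay)

print(by_name)  # The fox j<strong>umpe</strong>d up the tree.
```

The \g<...> forms are the safer choice when the template continues with a digit right after the reference, since \1 followed by a digit would be read as a different group number.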

How do I do this regex in Python?
