Question

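I'm having trouble using re.sub. From other answers I understand that the error occurs because I'm referencing a capture group that doesn't exist.

My question is: how do I adjust my code so that the pattern has a valid group?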

import re

s = "hello a world today b is sunny c day"
markers = "a b c".split()
pattern = r'\b' + r' (?:\w+ )?(?:\w+ )?'.join(markers) + r'\b'
text = re.sub(pattern, r'<b>\1</b>', s)   # raises re.error: invalid group reference

I want this result: "hello <b>a world today b is sunny c</b> day"

Answer 1

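If the pattern contains no capturing group, you cannot use the \1 backreference in the replacement. Add a capturing group to the pattern: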

pattern = r'\b(' + r' (?:\w+ )?(?:\w+ )?'.join(markers) + r')\b' # or
              ^                                             ^
pattern = r'\b({})\b'.format(r' (?:\w+ )?(?:\w+ )?'.join(markers))
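
As a minimal runnable sketch of the fix above, the snippet below wraps the whole expression in one capturing group and reuses the question's `s` and `markers`. The `gap` name and the `re.escape` call are assumptions added for illustration; escaping only matters if a marker contains regex metacharacters.

```python
import re

s = "hello a world today b is sunny c day"
markers = "a b c".split()

# Allow up to two intervening words between consecutive markers.
gap = r' (?:\w+ )?(?:\w+ )?'

# Assumption: escape each marker so punctuation in a marker is matched literally.
pattern = r'\b({})\b'.format(gap.join(re.escape(m) for m in markers))

# \1 now refers to the single capturing group wrapping the whole match.
text = re.sub(pattern, r'<b>\1</b>', s)
print(text)
```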

Alternatively, use \g<0> to insert the whole match instead of a capture group value (the regex then needs no modification):

text = re.sub(pattern, r'<b>\g<0></b>', s)

See the Python demo.
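
To make the difference concrete, here is a small sketch that runs the question's unmodified pattern twice: once with `\1`, which fails, and once with `\g<0>`, which works. The `error_message` variable is a hypothetical name used only to capture the failure for inspection.

```python
import re

s = "hello a world today b is sunny c day"
markers = "a b c".split()

# Original pattern from the question: it defines no capturing groups at all.
pattern = r'\b' + r' (?:\w+ )?(?:\w+ )?'.join(markers) + r'\b'

# Referencing group 1 fails because the pattern defines no such group.
try:
    re.sub(pattern, r'<b>\1</b>', s)
    error_message = None
except re.error as exc:
    error_message = str(exc)

# \g<0> always refers to the entire match, so no group is needed.
text = re.sub(pattern, r'<b>\g<0></b>', s)
print(error_message)
print(text)
```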

Answer 2

Your regular expression doesn't contain any capturing groups.

(?:...) is a non-capturing group. I think you want

pattern = r'\b(' + r' (?:\w+ )?(?:\w+ )?'.join(markers) + r')\b'
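
One way to check how many groups a pattern actually exposes to backreferences is the `groups` attribute of a compiled pattern. The sketch below compares the question's pattern with the parenthesized version; the names `gap`, `without_group`, and `with_group` are illustrative, not from the original posts.

```python
import re

markers = "a b c".split()
gap = r' (?:\w+ )?(?:\w+ )?'

without_group = re.compile(r'\b' + gap.join(markers) + r'\b')
with_group = re.compile(r'\b(' + gap.join(markers) + r')\b')

# (?:...) groups are not counted; only plain (...) groups are.
print(without_group.groups)  # capturing groups in the question's pattern
print(with_group.groups)     # capturing groups after adding (...)
```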

Answer 3

This code produces the result you want. I have tested it.

import re
s = "hello a world today b is sunny c day"
pat = r'(a.*c)'
result = re.sub(pat,r'<b>\1</b>',s)
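
A point that may be worth checking before adopting this pattern: `.*` is greedy and `a` and `c` are not anchored to word boundaries, so other inputs can behave differently. The sketch below uses a made-up input string, `other`, which is an assumption and not from the question, and compares the result with the marker-based pattern from the question.

```python
import re

# Hypothetical input: the question's string with extra text containing 'c'.
other = "hello a world today b is sunny c day, nice cat"

# Greedy pattern from this answer: runs from the first 'a' to the last 'c'.
greedy = re.sub(r'(a.*c)', r'<b>\1</b>', other)

# Marker-based pattern with word boundaries and a capturing group.
markers = "a b c".split()
bounded_pattern = r'\b(' + r' (?:\w+ )?(?:\w+ )?'.join(markers) + r')\b'
bounded = re.sub(bounded_pattern, r'<b>\1</b>', other)

print(greedy)
print(bounded)
```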

Invalid group reference when using re.sub()
