Question

The output of the following code:

import re

rpl = 'This is a nicely escaped newline \\n'
my_string = 'I hope this apple is replaced with a nicely escaped string'
reg = re.compile('apple')
reg.sub(rpl, my_string)

..is:

'I hope this This is a nicely escaped newline \n is replaced with a nicely escaped string'

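..so when printed:

I hope this This is a nicely escaped newline 
 is replaced with a nicely escaped string

So does Python unescape the replacement string when it substitutes it for 'apple' in the other string? For now I have just done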

reg.sub( rpl.replace('\\','\\\\'), my_string )

Is this safe? Is there a way to stop Python from doing this?

Answer 1

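From help(re.sub) [emphasis mine]:

sub(pattern, repl, string, count=0, flags=0)

    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    *if a string, backslash escapes in it are processed*.  If it is
    a callable, it's passed the match object and must return
    a replacement string to be used.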

One way to work around this is to pass a lambda:

>>> reg.sub(rpl, my_string )
'I hope this This is a nicely escaped newline \n is replaced with a nicely escaped string'
>>> reg.sub(lambda x: rpl, my_string )
'I hope this This is a nicely escaped newline \\n is replaced with a nicely escaped string'
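As a rough check that the callable and the backslash-doubling workaround from the question give the same result, here is a minimal sketch. The helper name sub_literal is hypothetical and not part of the re module; the variable names follow the question.

```python
import re

rpl = 'This is a nicely escaped newline \\n'   # ends in backslash + n
my_string = 'I hope this apple is replaced with a nicely escaped string'
reg = re.compile('apple')

def sub_literal(pattern, replacement, text):
    """Hypothetical helper: insert `replacement` verbatim."""
    # The return value of a callable is used as-is, so no backslash
    # escapes are processed in it.
    return pattern.sub(lambda match: replacement, text)

via_callable = sub_literal(reg, rpl, my_string)
# Doubling each backslash makes re turn "\\" back into a single "\".
via_doubling = reg.sub(rpl.replace('\\', '\\\\'), my_string)

print(via_callable == via_doubling)
print(repr(via_callable))
```

Both forms keep the backslash followed by n in the result. As far as I know, the re.escape documentation also recommends escaping only backslashes for a replacement string, which suggests the replace() workaround is reasonable when a callable is not convenient.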

Answer 2

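The re module processes backslash escapes in every pattern it is given, replacement patterns as well as search patterns. This is why the r modifier is usually used with regex patterns in Python: it reduces the amount of "backwhacking" needed to write usable patterns.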

The r modifier appears before a string constant and basically makes all \ characters verbatim (except those before the string delimiter). So, r'\\' == '\\\\' and r'\n' == '\\n'.
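The two equalities above can be checked directly. This is a small sketch, and the variable names are only illustrative:

```python
# A raw literal keeps every backslash it is written with.
two_backslashes = r'\\'   # two characters: backslash, backslash
backslash_n = r'\n'       # two characters: backslash, n

print(two_backslashes == '\\\\')
print(backslash_n == '\\n')
# The non-raw literal '\n' is a single newline character instead.
print(len(backslash_n), len('\n'))
```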

Writing your example as

rpl = r'This is a nicely escaped newline \\n'
my_string = 'I hope this apple is replaced with a nicely escaped string'
reg = re.compile(r'apple')
reg.sub( rpl, my_string )

works as expected: the result contains a literal backslash followed by n instead of a newline.
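To see what the raw literal changes, the sketch below runs the same substitution with a plain and a raw replacement string. The shortened replacement text and the names plain and raw are assumptions made for illustration.

```python
import re

my_string = 'I hope this apple is replaced with a nicely escaped string'
reg = re.compile(r'apple')

# Plain literal: repl is backslash + n, which re turns into a newline.
plain = reg.sub('newline \\n', my_string)
# Raw literal: repl is backslash + backslash + n, which re turns into
# a single backslash followed by n.
raw = reg.sub(r'newline \\n', my_string)

print(repr(plain))
print(repr(raw))
```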
