Question

Problem statement:

Insert an extra space after a period if the period is directly followed by a letter.

Below is the code:

string="This   is  very funny  and    cool.Indeed!"

re.sub("\.[a-zA-Z]", ". ", string)

And the output:

"This is very funny and cool. ndeed!"

It is replacing the first character after the '.' as well.

What is the solution for this?

Answer 1

You can use a positive lookahead assertion, which does not consume the text it matches:

>>> re.sub(r"\.(?=[a-zA-Z])", ". ", string)
'This   is  very funny  and    cool. Indeed!'

Alternatively, use a capturing group and a backreference:

>>> re.sub(r"\.([a-zA-Z])", r". \1", string)  # NOTE - r"raw string literal"
'This   is  very funny  and    cool. Indeed!'
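To make the difference between the two approaches concrete, here is a minimal sketch that wraps each pattern in a function and runs both on a few inputs. The function names and the sample strings are made up for illustration; only the two patterns come from the answer above.

```python
import re

def space_after_period_lookahead(text):
    # (?=[a-zA-Z]) only checks that a letter follows; the letter is not
    # part of the match, so it is not replaced.
    return re.sub(r"\.(?=[a-zA-Z])", ". ", text)

def space_after_period_backref(text):
    # The letter is consumed by the match, then written back through \1.
    return re.sub(r"\.([a-zA-Z])", r". \1", text)

# Hypothetical sample inputs, chosen to cover a few edge cases.
samples = ["cool.Indeed!", "a.b.c", "The end.", "3.14"]
for s in samples:
    print(repr(s), "->", repr(space_after_period_lookahead(s)))
```

Both functions should return the same result for these inputs. A period at the end of the string or before a digit is left alone, because neither is followed by a letter.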

FYI, you can use \S instead of [a-zA-Z] to match any non-whitespace character.
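One thing to keep in mind with \S is that it matches digits and punctuation too, not only letters. The sketch below shows the effect on a few inputs; the helper name and the test strings are illustrative assumptions, not part of the original answer.

```python
import re

def space_after_period_nonspace(text):
    # \S matches any non-whitespace character, so digits and
    # punctuation after the period also trigger the replacement.
    return re.sub(r"\.(?=\S)", ". ", text)

print(space_after_period_nonspace("cool.Indeed!"))  # letter after the period
print(space_after_period_nonspace("3.14"))          # digit after the period
print(space_after_period_nonspace("wait..."))       # period after the period
```

With \S, "3.14" becomes "3. 14" and "wait..." becomes "wait. . .", so [a-zA-Z] is the safer choice when the text may contain decimals or ellipses.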

Answer 2

You can also use a lookahead and a lookbehind together in the regex.

>>> import re
>>> string="This   is  very funny  and    cool.Indeed!"
>>> re.sub(r'(?<=\.)(?=[A-Za-z])', r' ', string)
'This   is  very funny  and    cool. Indeed!'

OR

you can use \b,

>>> re.sub(r'(?<=\.)\b(?=[A-Za-z])', r' ', string)
'This   is  very funny  and    cool. Indeed!'

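Explanation

(?<=\.) looks behind for a literal dot.
(?=[A-Za-z]) asserts that the matched boundary must be followed by a letter.
If both hold, the boundary is replaced with a space.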
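The "boundary" here is a zero-width match: the pattern matches a position, not a character. A small sketch using re.finditer makes this visible; the variable names are my own, and the string is the one from the question.

```python
import re

string = "This   is  very funny  and    cool.Indeed!"
pattern = re.compile(r"(?<=\.)(?=[A-Za-z])")

# Each match is an empty span located between the dot and the letter.
spans = [m.span() for m in pattern.finditer(string)]
dot = string.index(".")
print(spans, "dot at index", dot)

# Substituting at that position inserts a space without removing anything.
result = pattern.sub(" ", string)
print(result)
```

Because the match has the same start and end index, nothing from the original string is consumed, and the result is exactly one character longer than the input.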

Wrong output using re.sub()
