Problem statement:
Insert an extra space after a period if the period is immediately followed by a letter.
Here is the code:
import re
string = "This is very funny and cool.Indeed!"
re.sub(r"\.[a-zA-Z]", ". ", string)
And the output:
"This is very funny and cool. ndeed!"
It is replacing the first character after the '.' as well.
What is the solution for this?
Answer 0 (score: 3)
You can use a positive lookahead assertion, which does not consume the part it matches:
>>> re.sub(r"\.(?=[a-zA-Z])", ". ", string)
'This is very funny and cool. Indeed!'
Alternatively, use a capturing group and a backreference:
>>> re.sub(r"\.([a-zA-Z])", r". \1", string) # NOTE - r"raw string literal"
'This is very funny and cool. Indeed!'
FYI, you can use \S instead of [a-zA-Z] to match any non-whitespace character.
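Whether \S is a good substitute depends on the input. The following is a minimal sketch, not part of the original answer, that compares the two character classes on a sentence containing a decimal number. The helper name `space_after_period` and the sample text are made up for illustration.

```python
import re

def space_after_period(text, follower=r"[a-zA-Z]"):
    # Hypothetical helper: the lookahead checks the next character
    # without consuming it, so only the period itself is replaced.
    return re.sub(r"\.(?=" + follower + ")", ". ", text)

sample = "Pi is 3.14.Indeed!"

letters_only = space_after_period(sample)        # period must precede a letter
non_space = space_after_period(sample, r"\S")    # period before any non-space

print(letters_only)  # Pi is 3.14. Indeed!
print(non_space)     # Pi is 3. 14. Indeed!
```

With \S the decimal point in 3.14 is also treated as a sentence end, so [a-zA-Z] is probably the safer choice when the text may contain numbers.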
Answer 1 (score: 0)
You can also use a lookahead and a lookbehind together in the regex.
>>> import re
>>> string="This is very funny and cool.Indeed!"
>>> re.sub(r'(?<=\.)(?=[A-Za-z])', r' ', string)
'This is very funny and cool. Indeed!'
OR
you can use \b:
>>> re.sub(r'(?<=\.)\b(?=[A-Za-z])', r' ', string)
'This is very funny and cool. Indeed!'
Explanation
(?<=\.) simply looks behind for a literal dot.
(?=[A-Za-z]) asserts that the matched position must be followed by a letter.
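To make the zero-width behaviour concrete, here is a small sketch (an illustration added under the assumption that the same sample string is used) showing that the pattern matches an empty position between the dot and the letter, and that adding \b does not change the result for this input.

```python
import re

text = "This is very funny and cool.Indeed!"

# Lookbehind + lookahead: neither assertion consumes any characters.
m = re.search(r"(?<=\.)(?=[A-Za-z])", text)

# The match is empty: start and end are the same index.
print(m.start() == m.end(), repr(m.group()))

# The position sits right between the dot and the letter.
print(text[m.start() - 1], text[m.start()])

# Substituting a space at that position leaves both neighbours intact.
without_b = re.sub(r"(?<=\.)(?=[A-Za-z])", " ", text)

# A dot is a non-word character and a letter is a word character,
# so a word boundary always exists here and \b adds nothing.
with_b = re.sub(r"(?<=\.)\b(?=[A-Za-z])", " ", text)

print(without_b)
print(with_b == without_b)
```

Because nothing is consumed, the replacement string only needs to contain the space itself.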