Python RegEx: remove newlines (that shouldn't be there)

Date: 2017-12-05 11:22:49

Tags: python regex

I extracted some text and want to clean it up with a regex.

I've learned basic regex, but I don't know how to build this one:

str = '''
this is 
a line that has been cut.
This is a line that should start on a new line
'''

It should be converted to:

str = '''
this is a line that has been cut.
This is a line that should start on a new line
'''

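r'\w\n\w' seems to catch it, but I'm not sure how to replace the newline with a space without touching the end of the preceding word and the start of the following one.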

1 answer:

Answer 0: (score: 3)

You can use this negative lookbehind regex with re.sub:

>>> import re
>>> str = '''
... this is 
... a line that has been cut.
... This is a line that should start on a new line
... '''
>>> # "this is " above ends with a trailing space, as in the question
>>> print(re.sub(r'(?<!\.)\n', '', str))
this is a line that has been cut.
This is a line that should start on a new line
>>>


(?<!\.)\n matches every newline that is not immediately preceded by a dot.
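To make the behaviour of the negative lookbehind easier to check, here is a minimal Python 3 sketch of the same substitution. The function name join_cut_lines and the variable text are illustrative names, not part of the original answer, and the sample string assumes the trailing space after "this is" that appears in the question.

```python
import re

# Sample text from the question. Note the trailing space after "this is".
text = (
    "\n"
    "this is \n"
    "a line that has been cut.\n"
    "This is a line that should start on a new line\n"
)

def join_cut_lines(s):
    """Remove every newline that is not immediately preceded by a period."""
    return re.sub(r'(?<!\.)\n', '', s)

result = join_cut_lines(text)
print(result)
```

Because the lookbehind only checks for a dot, the leading newline and the final newline (which follows "line", not a dot) are removed as well. That is why the printed result has no blank first line.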

If you don't want the match to depend on the presence of a dot, use:

re.sub(r'(?<=\w\s)\n', '', str)

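The sketch below runs the (?<=\w\s)\n pattern on the sample text and, as an extra variant that is not from the original answer, shows one way to handle the case the question describes, where the cut line has no trailing space and the newline itself should become a space. The function names and the second pattern, (?<=\w)\n(?=\w), are assumptions made for illustration.

```python
import re

# Sample text with a trailing space after "this is", as in the question.
text = (
    "\n"
    "this is \n"
    "a line that has been cut.\n"
    "This is a line that should start on a new line\n"
)

def join_after_word_and_space(s):
    """Remove a newline only when it follows a word character plus whitespace."""
    return re.sub(r'(?<=\w\s)\n', '', s)

def join_between_words(s):
    """Hypothetical variant: replace a newline sitting directly between two
    word characters with a single space. The lookarounds match positions
    only, so the surrounding characters are left untouched."""
    return re.sub(r'(?<=\w)\n(?=\w)', ' ', s)

joined = join_after_word_and_space(text)

# Same text, but without the trailing space on the cut line.
text_no_space = text.replace("this is \n", "this is\n")
joined_no_space = join_between_words(text_no_space)

print(repr(joined))
print(repr(joined_no_space))
```

Unlike the dot-based pattern, both of these leave the leading and trailing newlines of the string in place, since neither of those newlines sits between the required characters.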