Python regex multiline matching

Time: 2019-07-28 02:33:22

Tags: python regex

I need to match text that spans multiple lines in Python.

group one start
line 1 data
group end
group two start
group two data
group end

Given the string above, how can I get the following output?

[group one start \n line 1 data \n group end, group two start \n group two data \n group end]

I tried the following code, but it does not work:

import re 

re.findall(r'group.*start.*group end',re.MULTILINE | re.DOTALL)

for info in data:
   print info

3 answers:

Answer 0: (score: 0)

Perhaps an expression along these lines:

\bgroup [\s\S]*? start\b[\s\S]*?\bgroup end\b

DEMO 1

or:

\bgroup .*? start\b.*?\bgroup end\b

DEMO 2

with the DOTALL flag, might work here.

Test with DOTALL:

import re

regex = r"\bgroup .*? start\b.*?\bgroup end\b"

test_str = """
group one start
line 1 data
group end
group two start
group two data
group end
"""

print(re.findall(regex, test_str, re.DOTALL))

Test without DOTALL:

import re

regex = r"(\bgroup [\s\S]*? start\b[\s\S]*?\bgroup end\b)"

test_str = """
group one start
line 1 data
group end
group two start
group two data
group end

"""


print(re.findall(regex, test_str))

Output

['group one start\nline 1 data\ngroup end', 'group two start\ngroup two data\ngroup end']

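The expression is explained in the top-right panel of regex101.com if you wish to explore, simplify, or modify it, and at this link you can see how it matches against some sample inputs.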
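The question passed re.MULTILINE, which on its own does not let . cross line breaks; it only makes ^ and $ match at every line boundary. As a minimal sketch of how the two flags could be combined, the variant below anchors both marker lines to whole lines. The pattern and the name `anchored` are illustrative assumptions, not part of the answer above, and `\w+` assumes each group name is a single word.

```python
import re

test_str = """group one start
line 1 data
group end
group two start
group two data
group end
"""

# re.MULTILINE: ^ and $ match at the start and end of every line.
# re.DOTALL: . also matches newlines, so .*? can span the lines in between.
# Assumption: the group name is a single word, hence \w+.
anchored = re.compile(r"^group \w+ start$.*?^group end$", re.MULTILINE | re.DOTALL)

blocks = anchored.findall(test_str)
print(blocks)
```

Anchoring with ^ and $ is stricter than \b: a line such as "my group end here" would not be treated as a closing marker.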

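Answer 1: (score: 0)

You can simply split the text on the newline that follows group end; the lookbehind checks for group end without consuming it, so it stays in the result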

>>> import re
>>> text_data = """group one start
... line 1 data
... group end
... group two start
... group two data
... group end"""
>>> 
>>> re.split(r'(?<=group end)\n', text_data)
['group one start\nline 1 data\ngroup end', 'group two start\ngroup two data\ngroup end']
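One thing to watch with the split approach: if the text ends with a newline after the last group end (as the test strings in the other answers do), re.split also returns a trailing empty string. A small sketch of one way to handle that, assuming empty pieces can simply be discarded; the names `parts` and `groups` are illustrative.

```python
import re

# Same data as above, but with a trailing newline after the last "group end".
text_data = (
    "group one start\nline 1 data\ngroup end\n"
    "group two start\ngroup two data\ngroup end\n"
)

# The final newline is also preceded by "group end", so the split
# produces an empty string as its last element.
parts = re.split(r"(?<=group end)\n", text_data)

# Drop empty pieces to keep only the actual groups.
groups = [p for p in parts if p]
print(groups)
```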

Answer 2: (score: 0)

The following code works for me

import re

a = """group one start
line 1 data
group end
group two start
group two data
group end
"""
# Lazy quantifiers (.*?) stop at the nearest "group end"; DOTALL lets . match newlines.
all_m = re.findall(r'group.*?start.*?group end',a,re.DOTALL)
for m in all_m:
    print(m)
    print("**********")

Output

group one start
line 1 data
group end
**********
group two start
group two data
group end
**********
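For comparison with the attempt in the question: re.findall takes the pattern, then the string, then the flags, and the question's call left out the string. Even with the string supplied, the greedy .* would run to the last group end and return everything as one match. The sketch below contrasts the two forms on the same data; the variable names `greedy` and `lazy` are illustrative.

```python
import re

a = """group one start
line 1 data
group end
group two start
group two data
group end
"""

# Greedy: .* grabs as much as possible, so the match runs from the
# first "group" to the last "group end".
greedy = re.findall(r"group.*start.*group end", a, re.DOTALL)

# Lazy: .*? grabs as little as possible, so each match stops at the
# nearest "group end".
lazy = re.findall(r"group.*?start.*?group end", a, re.DOTALL)

print(len(greedy))
print(len(lazy))
```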