Below is the string I want to parse:
a=' //TS_START
/*TG_HEADER_START
title="XYX"
ident=""
*/
/*
<TC_HEADER_START>
title=" Halted after Tester Connect"
ident="TC1"
variants="A C"
name="TC">
TestcaseDescription= This >
TestcaseRequirements=36978
StakeholderRequirements=1236
TestcaseParameters:
TS_Implemented=Yes;
TS_Automation=Automated;
TS_Techniques= Testing;
TS_Priority=1;
TS_Tested_By=qz9ghv;
TS_Review_done=Yes;
TS_Regression=No
TestcaseTestType=Test
</TC_HEADER_END>
<TC_HEADER_START>
title=" Halted after Tester Connect"
ident="TC1"
variants="A C"
name="TC">
TestcaseDescription= This >
TestcaseRequirements=36978
StakeholderRequirements=1236
TestcaseParameters:
TS_Implemented=Yes;
TS_Automation=Automated;
TS_Techniques= Testing;
TS_Priority=1;
TS_Tested_By=qz9ghv;
TS_Review_done=Yes;
TS_Regression=No
TestcaseTestType=Test
</TC_HEADER_END>
*/
testcase TC_GEEA2_VGM_DOIP_01(char strDescription[], char strReq[], char strParams[])
{
}
/*TG_HEADER_END*/
zd.a.S,D.,AS'
A/S,D/.A.SD./
//<TS_END>'
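I would like to parse this string and get a list of strings, each starting with <TC_HEADER_START> and ending with </TC_HEADER_END>. I tried the following regular expression, but it matches everything from the first opening tag to the last closing tag instead of stopping at the first closing tag.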
aa=re.findall(r'<TC_HEADER_START>([\s\S]*)</TC_HEADER_END>',a)
Expected output:
aa=['<TC_HEADER_START>
title=" Halted after Tester Connect"
ident="TC1"
variants="A C"
name="TC">
TestcaseDescription= This >
TestcaseRequirements=36978
StakeholderRequirements=1236
TestcaseParameters:
TS_Implemented=Yes;
TS_Automation=Automated;
TS_Techniques= Testing;
TS_Priority=1;
TS_Tested_By=qz9ghv;
TS_Review_done=Yes;
TS_Regression=No
TestcaseTestType=Test
</TC_HEADER_END>','<TC_HEADER_START>
title=" Halted after Tester Connect"
ident="TC1"
variants="A C"
name="TC">
TestcaseDescription= This >
TestcaseRequirements=36978
StakeholderRequirements=1236
TestcaseParameters:
TS_Implemented=Yes;
TS_Automation=Automated;
TS_Techniques= Testing;
TS_Priority=1;
TS_Tested_By=qz9ghv;
TS_Review_done=Yes;
TS_Regression=No
TestcaseTestType=Test
</TC_HEADER_END>']
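Answer 0 (score: 1)
Your regular expression is almost correct. You want a lazy quantifier (*?) instead of a greedy one (*): the greedy form consumes as much as possible and so runs to the last </TC_HEADER_END>, while the lazy form stops at the first one.
Try this: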
<TC_HEADER_START>([\s\S]*?)</TC_HEADER_END>
Or try it on regex101.
If you also want to include the enclosing tags, wrap them in capturing groups as well:
(<TC_HEADER_START>)([\s\S]*?)(</TC_HEADER_END>)
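The difference between the two quantifiers, and what re.findall returns for each pattern shape, can be sketched as follows. This is a minimal illustration: sample is a shortened, hypothetical stand-in for the string in the question, not the original data. Note that when a pattern has several capturing groups, re.findall returns one tuple per match, so dropping the groups is one way to get the whole block (tags included) as a single string.

```python
import re

# Shortened, hypothetical stand-in for the string in the question.
sample = """/*
<TC_HEADER_START>
ident="TC1"
</TC_HEADER_END>
<TC_HEADER_START>
ident="TC2"
</TC_HEADER_END>
*/"""

# Greedy: [\s\S]* runs to the LAST closing tag, so both blocks merge into one match.
greedy = re.findall(r'<TC_HEADER_START>[\s\S]*</TC_HEADER_END>', sample)

# Lazy: [\s\S]*? stops at the FIRST closing tag after each opening tag.
# With no capturing groups, findall returns each whole match, tags included.
lazy = re.findall(r'<TC_HEADER_START>[\s\S]*?</TC_HEADER_END>', sample)

# With three capturing groups, findall returns one 3-tuple per match instead.
grouped = re.findall(r'(<TC_HEADER_START>)([\s\S]*?)(</TC_HEADER_END>)', sample)

print(len(greedy))   # 1
print(len(lazy))     # 2
print(lazy[0])
print(grouped[0])
```

If a list of plain strings like the expected output in the question is the goal, the group-free lazy pattern is probably the closest fit; the three-group form is more useful when the tags and the body need to be handled separately.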
Answer 1 (score: 0)
Pass the re.S (re.DOTALL) flag so that . also matches newlines. See re.M and re.S in the docs: https://docs.python.org/3/library/re.html?highlight=re.S#re.MULTILINE
import re
aa=re.findall(r'<TC_HEADER_START>(.*?)</TC_HEADER_END>',a,re.S)
print(len(aa))
print(aa[0])
Output:
2
title=" Halted after Tester Connect"
ident="TC1"
variants="A C"
name="TC">
TestcaseDescription= This >
TestcaseRequirements=36978
StakeholderRequirements=1236
TestcaseParameters:
TS_Implemented=Yes;
TS_Automation=Automated;
TS_Techniques= Testing;
TS_Priority=1;
TS_Tested_By=qz9ghv;
TS_Review_done=Yes;
TS_Regression=No
TestcaseTestType=Test
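To see what re.S contributes here, the sketch below runs the same pattern with and without the flag. It assumes a shortened, hypothetical sample string rather than the full input from the question. Because the captured group excludes the tags, each body starts and ends with a newline, which str.strip() can remove if that is not wanted.

```python
import re

# Shortened, hypothetical stand-in for the string in the question.
sample = (
    '<TC_HEADER_START>\nident="TC1"\n</TC_HEADER_END>\n'
    '<TC_HEADER_START>\nident="TC2"\n</TC_HEADER_END>'
)

pattern = r'<TC_HEADER_START>(.*?)</TC_HEADER_END>'

# Without re.S, "." does not match "\n", so no block can be spanned.
without_flag = re.findall(pattern, sample)

# With re.S (alias of re.DOTALL), "." matches newlines too.
with_flag = re.findall(pattern, sample, re.S)

# The group captures only the text between the tags, including the
# surrounding newlines; strip them for cleaner bodies.
bodies = [body.strip() for body in with_flag]

print(without_flag)  # []
print(len(with_flag))
print(bodies)
```

With re.S the pattern behaves like the [\s\S]*? version from the other answer; which spelling to use is largely a matter of taste.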