What I want: the match should fail if y is not found before x.
import re
a = '''START aaaadkdklfje VALUE aaaadkdklfjeaaaadkdklfjeaaaadkdklfje aaaadkdklfjeaaaadkdklfjeaaaadkdklfjeaaaadkdklfjeaaaadkdklfjeaaaadkdklfje aaaadkdklfjeaaaadkdklfje aaaadkdklfje
aaaadkdklfje
aaaadkdklfje condition a
aaaadkdklfje
aaaadkdklfje
aaaadkdklfje condition b
aaaadkdklfje z
aaaadkdklfjeaaaadkdklfje aaaadkdklfjeqqqsdddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddfsdfsdf
condition c
???kjij
START...'''
b = re.findall(r'START condition a (VALUE).+?condition b.+?condition c(?!START)', a, re.DOTALL)
if b:
    for x in b:
        print(x)
I want to capture condition only when value is present in the text block, without matching past the next start.
This is the only case that should match:
start
?, value, ?, condition a, ?, condition b, ?, condition c # I want the matching to be done only in here
start
...
Not these:
start
?, value, condition a, ?
start
?, value, ?, condition b, condition c
start
Answer 0 (score: 2)
Another approach is to do it in several steps:
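# s is the input text (the string a in the question)
blocks = re.split(r'\bSTART\b', s)
# re.DOTALL lets . cross newlines, since the conditions sit on separate lines
blocks = [x for x in blocks[1:] if re.search(r'condition a.*?condition b.*?condition c', x, re.DOTALL)]
blocks = ['START' + x for x in blocks]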
Note: if you want the conditions to come after the keyword VALUE, add \bVALUE\b.*? at the start of the search pattern.
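The steps above, including the optional VALUE prefix from the note, can be put together as a small runnable sketch. The function name matching_blocks and the sample text are made up for illustration, and re.DOTALL is assumed because the conditions in the question sit on separate lines.

```python
import re

# Search pattern with the optional \bVALUE\b.*? prefix from the note.
# re.DOTALL is an assumption here: it lets . cross line breaks.
CONDITIONS = re.compile(
    r'\bVALUE\b.*?condition a.*?condition b.*?condition c',
    re.DOTALL,
)

def matching_blocks(text):
    """Return the START blocks that contain VALUE, then a, b, c in order."""
    # Splitting first means no search can run past the next START.
    blocks = re.split(r'\bSTART\b', text)[1:]
    return ['START' + block for block in blocks if CONDITIONS.search(block)]

# Hypothetical sample: only the first block has all three conditions.
sample = (
    "START x VALUE y\ncondition a\nfoo\ncondition b\nbar\ncondition c\n"
    "START x VALUE condition a\n"
    "START x VALUE y condition b condition c\n"
    "START"
)

result = matching_blocks(sample)
print(len(result))  # 1
```

Because each block is searched on its own, the condition a in the second block and the condition b / condition c in the third block cannot be combined into one match.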
Answer 1 (score: 1)
You can combine several lookarounds so that START is never skipped over and the order of the conditions is preserved:
(?s)START(?:(?!START|condition).)*?\b(VALUE)(?=(?:(?!START).)*?condition a(?:(?!START).)*?condition b(?:(?!START).)*?condition c)
Test at regex101, but note that this performs very poorly :]
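As a quick check in Python, the pattern can be run against a small text. The sample string below is invented for illustration and is not the asker's data; the pattern itself is copied unchanged and only split across lines for readability.

```python
import re

# The answer's pattern, unchanged; (?s) makes . match newlines as well.
PATTERN = re.compile(
    r'(?s)START(?:(?!START|condition).)*?\b(VALUE)'
    r'(?=(?:(?!START).)*?condition a'
    r'(?:(?!START).)*?condition b'
    r'(?:(?!START).)*?condition c)'
)

# Hypothetical sample: only the first block holds a, b and c in order.
sample = (
    "START x VALUE y\ncondition a\nfoo\ncondition b\nbar\ncondition c\n"
    "START x VALUE condition a\n"
    "START x VALUE y condition b condition c\n"
    "START"
)

# Offsets of the START that begins each successful match.
starts = [m.start() for m in PATTERN.finditer(sample)]
print(starts)  # [0]
```

The match itself ends right after VALUE, because the conditions are only checked inside the lookahead and are not consumed.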
This pattern allows condition a condition a condition b condition c. To make the conditions exclusive, change condition a(?:(?!START).)*? and the b and c parts to condition a(?:(?!START|condition).)*? ...
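A minimal sketch of the difference between the two gap styles, under one reading of the change: only the gaps after condition a and condition b are tightened, and the gap in front of condition a is left as it was. The helper build and both sample strings are hypothetical.

```python
import re

# Tempered gaps: LOOSE may cross other "condition" text, STRICT may not.
LOOSE = r'(?:(?!START).)*?'
STRICT = r'(?:(?!START|condition).)*?'

def build(gap):
    # The prefix is the answer's pattern up to the captured VALUE; only the
    # gaps after "condition a" and "condition b" are swapped (an assumption).
    return re.compile(
        r'(?s)START(?:(?!START|condition).)*?\b(VALUE)'
        r'(?=' + LOOSE + 'condition a' + gap + 'condition b' + gap + 'condition c)'
    )

permissive = build(LOOSE)
exclusive = build(STRICT)

clean = 'START x VALUE y condition a p condition b q condition c\nSTART'
repeated = ('START x VALUE y condition a p condition b q '
            'condition b r condition c\nSTART')

print(bool(permissive.search(repeated)))  # True
print(bool(exclusive.search(repeated)))   # False
print(bool(exclusive.search(clean)))      # True
```

Under this reading the gap before condition a still only forbids START, so a repeated condition a ahead of the sequence is skipped rather than rejected; the sketch therefore checks the repeated condition b case only.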