How do I design a regular expression that captures all the characters between two strings? Specifically, from this big string:
Studies have shown that...[^title=Fish consumption and incidence of stroke: a meta-analysis of cohort studies]... Another experiment demonstrated that... [^title=The second title]
I want to extract all the characters between [^title= and ], that is, Fish consumption and incidence of stroke: a meta-analysis of cohort studies and The second title.
I think I have to use re.findall(), and I can start with re.findall(r'\[([^]]*)\]', big_string), which gives me every match between square brackets [ ], but I don't know how to extend it.
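For reference, here is a minimal sketch of what that starting pattern returns on the sample text. The variable name big_string comes from the call above; the name bracketed is only illustrative. Each match still carries the ^title= prefix, which is the part that needs to move out of the capture group.

```python
import re

big_string = (
    "Studies have shown that...[^title=Fish consumption and incidence of "
    "stroke: a meta-analysis of cohort studies]... Another experiment "
    "demonstrated that... [^title=The second title]"
)

# The group ([^]]*) captures everything between "[" and the next "]",
# so the "^title=" prefix ends up inside each result.
bracketed = re.findall(r'\[([^]]*)\]', big_string)
print(bracketed)
```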
Answer 0 (score: 5)
>>> import re
>>> text = "Studies have shown that...[^title=Fish consumption and incidence of stroke: a meta-analysis of cohort studies]... Another experiment demonstrated that... [^title=The second title]"
>>> re.findall(r"\[\^title=(.*?)\]", text)
['Fish consumption and incidence of stroke: a meta-analysis of cohort studies', 'The second title']
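Here is a breakdown of the regex:
\[ is an escaped [ character.
\^ is an escaped ^ character.
title= matches the literal text title=.
(.*?) matches any characters, non-greedily, and puts them in a group (for findall to extract). Being non-greedy, it stops at the first place where the rest of the pattern can match, that is, at the next
\] , which is an escaped ] character.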
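As a variation on the pattern above, the non-greedy (.*?) can be swapped for a negated character class, [^\]]*, which matches any run of characters other than ]. This is a sketch, not part of the original answer: the names TITLE_PATTERN and extract_titles are made up for illustration. One practical difference is that . does not match a newline unless re.DOTALL is set, whereas the negated class does, so this version also handles a title that wraps across lines.

```python
import re

# Hypothetical names, introduced only for this example.
# [^\]]* matches zero or more characters that are not "]".
TITLE_PATTERN = re.compile(r"\[\^title=([^\]]*)\]")


def extract_titles(text):
    """Return the text between each '[^title=' and the next ']'."""
    return TITLE_PATTERN.findall(text)


text = (
    "Studies have shown that...[^title=Fish consumption and incidence of "
    "stroke: a meta-analysis of cohort studies]... Another experiment "
    "demonstrated that... [^title=The second title]"
)

titles = extract_titles(text)
print(titles)

# A title containing a newline is still captured by the negated class.
wrapped = extract_titles("[^title=first line\nsecond line]")
print(wrapped)
```

On the sample string both patterns return the same two titles; neither handles a title that itself contains a ] character.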