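I have a header where I can start matching text, and for the end of the section I use a backreference to the header to determine where the section ends:
Sample data: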
Section 1
sub-header here:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam sed interdum erat. Donec sed felis sit amet sem mattis aliquet non in turpis.
sub-section with one newline above
option A
option B
sub-section 2 with two newline above
setting1: value of setting1
setting2: value of setting2
Section 2
sub-header here:
Nulla maximus mollis urna, in lobortis est auctor a. Ut erat enim, volutpat id tortor eget, elementum fermentum nisi.
sub-section with one newline above
option A
option B
sub-section 2 with two newline above
setting1: value of setting1
setting2: value of setting2
Section 3
sub-header here:
Sed suscipit eleifend arcu fringilla pulvinar. Maecenas ullamcorper efficitur fringilla.
sub-section with one newline above
option A
option B
sub-section 2 with two newline above
setting1: value of setting1
setting2: value of setting2
My regex is:
(?:^|\n)((Section\s*)(\d+))$([\s\S]*?)(?=\2)
This matches the first two sections, but not the last one.
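To make the failure concrete, here is a minimal Python sketch that reproduces it. It assumes the pattern is run with the multiline flag (the question does not say which engine or flags are in use) and uses a shortened, made-up version of the sample data. The lookahead `(?=\2)` requires another `Section ` to follow the body, and nothing follows the last section, so that match fails.

```python
import re

# Shortened stand-in for the sample data (assumption: one line per item).
text = (
    "Section 1\n"
    "sub-header here:\n"
    "option A\n"
    "Section 2\n"
    "sub-header here:\n"
    "option B\n"
    "Section 3\n"
    "sub-header here:\n"
    "option C\n"
)

# The pattern from the question. MULTILINE is assumed so that `$` can match
# at the end of the header line rather than only at the end of the input.
original = re.compile(
    r"(?:^|\n)((Section\s*)(\d+))$([\s\S]*?)(?=\2)",
    re.MULTILINE,
)

# Group 1 is the full header, e.g. "Section 1".
headers = [m.group(1) for m in original.finditer(text)]
print(headers)  # ['Section 1', 'Section 2']
```

"Section 3" is never reported because there is no later `Section ` for the lookahead to find.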
Answer 0 (score: 1)
Try this regex:
(Section\s*\d+)([\s\S]*?)(?=\s*Section\s*\d+|$)
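Explanation:
(Section\s*\d+) - matches the text Section, followed by 0+ whitespace characters, followed by 1+ digits, and captures the whole thing in group 1
([\s\S]*?) - matches 0+ occurrences of any character, as few as possible, and captures them in group 2
(?=\s*Section\s*\d+|$) - positive lookahead to ensure that what was matched above is followed either by the end of the string, or by 0+ whitespace followed by Section, followed by 0+ whitespace, followed by 1+ digits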
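As a quick check, the regex above can be tried in Python like this. This is only a sketch on shortened, made-up sample data, and the variable names are my own. Note that it is compiled without the multiline flag: the `|$` alternative relies on `$` meaning the end of the input, and with multiline enabled the lazy group would stop at the end of the header line.

```python
import re

# Shortened stand-in for the sample data (assumption: one line per item).
text = (
    "Section 1\n"
    "sub-header here:\n"
    "option A\n"
    "Section 2\n"
    "sub-header here:\n"
    "option B\n"
    "Section 3\n"
    "sub-header here:\n"
    "option C\n"
)

# No re.MULTILINE here: `$` must only match at the end of the input.
pattern = re.compile(r"(Section\s*\d+)([\s\S]*?)(?=\s*Section\s*\d+|$)")

# Group 1 is the header, group 2 is the body; strip the leading newline.
sections = [(m.group(1), m.group(2).strip()) for m in pattern.finditer(text)]

for header, body in sections:
    print(header, "->", repr(body))
```

All three sections are returned, including the last one, because the lookahead accepts the end of the string as a valid section boundary.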