I have a string that looks like this:
1.1 Title: title1
line1
line2
line3
1.2 Title: Title2
line1
line2
line3
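Is there a regular expression that matches each block starting with a 1.x heading? All my attempts gave me either only the first line or the whole file.
Thanks for your help.
Edit: the output would be a list of strings, in this case: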
s1 = '1.1 Title: title1
line1
line2
line3'
and
s2 = '1.2 Title: title2
line1
line2
line3'
The number of lines is unknown, and so is the number of blocks.
Answer 0 (score: 1)
If your lines always follow this format, you can use the following:
import re
matches = re.findall(r'(?s)(1\.\d+\s+Title:(?:(?!\n1\.\d).)+)', s)
Or you can split the string at the start of each block:
matches = re.split(r'(?m)\s+(?=^1\.\d)', s)
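As a minimal sketch of how the two approaches above behave, the snippet below runs both on the sample from the question. The variable names `blocks_findall` and `blocks_split` are made up for illustration, and the sample is assumed to have no trailing newline.

```python
import re

# Sample input from the question (assumed to have no trailing newline).
s = (
    "1.1 Title: title1\n"
    "line1\n"
    "line2\n"
    "line3\n"
    "1.2 Title: Title2\n"
    "line1\n"
    "line2\n"
    "line3"
)

# Approach 1: match a "1.x Title:" heading, then consume characters
# (including newlines, because of (?s)) until the next "\n1.<digit>".
blocks_findall = re.findall(r'(?s)(1\.\d+\s+Title:(?:(?!\n1\.\d).)+)', s)

# Approach 2: split on whitespace that is immediately followed by a
# line starting with "1.<digit>" ((?m) lets ^ match at each line start).
blocks_split = re.split(r'(?m)\s+(?=^1\.\d)', s)

for block in blocks_findall:
    print(repr(block))
```

On this sample both calls return the same two-element list. If the input ends with a newline, the last block keeps it in both cases, so calling `str.strip()` on each block may be worthwhile.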
Answer 1 (score: 0)
"(^\d.\d[^\n]+\d(?:\D+\d)+?(?=\n\d.\d))|(^\d.\d[^\n]+\d(?:\D+\d)+$)"gms
is what I came up with. It captures each block separately, but it isn't very pretty.
Explanation from Regex101.com:
"(^\d.\d[^\n]+\d(?:\D+\d)+?(?=\n\d.\d))|(^\d.\d[^\n]+\d(?:\D+\d)+$)"gms
1st Alternative: (^\d.\d[^\n]+\d(?:\D+\d)+?(?=\n\d.\d))
1st Capturing group (^\d.\d[^\n]+\d(?:\D+\d)+?(?=\n\d.\d))
^ assert position at start of a line
\d match a digit [0-9]
. matches any character
\d match a digit [0-9]
[^\n]+ match a single character not present in the list below
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\n matches a line-feed (newline) character (ASCII 10)
\d match a digit [0-9]
(?:\D+\d)+? Non-capturing group
Quantifier: Between one and unlimited times, as few times as possible, expanding as needed [lazy]
\D+ match any character that is not a digit [^0-9]
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\d match a digit [0-9]
(?=\n\d.\d) Positive Lookahead - Assert that the regex below can be matched
\n matches a line-feed (newline) character (ASCII 10)
\d match a digit [0-9]
. matches any character
\d match a digit [0-9]
2nd Alternative: (^\d.\d[^\n]+\d(?:\D+\d)+$)
2nd Capturing group (^\d.\d[^\n]+\d(?:\D+\d)+$)
^ assert position at start of a line
\d match a digit [0-9]
. matches any character
\d match a digit [0-9]
[^\n]+ match a single character not present in the list below
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\n matches a line-feed (newline) character (ASCII 10)
\d match a digit [0-9]
(?:\D+\d)+ Non-capturing group
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\D+ match any character that is not a digit [^0-9]
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\d match a digit [0-9]
$ assert position at end of a line
g modifier: global. All matches (do not return on first match)
m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
s modifier: single line. Dot matches newline characters
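The pattern above is written in Regex101 notation with the `gms` flags. A rough Python equivalent, offered as a sketch rather than a definitive translation, maps `m` and `s` to `re.MULTILINE` and `re.DOTALL` and replaces `g` with `re.finditer`. The names `pattern` and `blocks` are illustrative. Note that the pattern depends on the shape of the sample: the heading line and the end of each block must contain a digit, and the unescaped `.` matches any character, not only a literal dot.

```python
import re

# Sample input from the question (assumed to have no trailing newline).
s = (
    "1.1 Title: title1\n"
    "line1\n"
    "line2\n"
    "line3\n"
    "1.2 Title: Title2\n"
    "line1\n"
    "line2\n"
    "line3"
)

# The answer's pattern, copied verbatim. The first alternative stops
# before the next heading; the second handles the final block.
pattern = re.compile(
    r"(^\d.\d[^\n]+\d(?:\D+\d)+?(?=\n\d.\d))|(^\d.\d[^\n]+\d(?:\D+\d)+$)",
    re.MULTILINE | re.DOTALL,
)

# group(0) is the whole match regardless of which alternative matched,
# which avoids the tuples that re.findall returns for two groups.
blocks = [m.group(0) for m in pattern.finditer(s)]

for block in blocks:
    print(repr(block))
```

Using `finditer` with `group(0)` is a design choice here: since the pattern has two capturing groups, `re.findall` would return a list of tuples with one empty string in each.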