Question

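I have a file containing paragraphs that start with AB, and I want to get all of those paragraphs. I used the following code, but it returns nothing: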

import re
paragraphs = re.findall(r'AB[.\n]+AD',text) #AD is the beginning of the next paragraph

Any idea why this doesn't work?

Thanks

Answer 1

Try:

re.findall(r'AB.+?(?=AD)', text, re.DOTALL)

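The re.DOTALL flag makes the dot match any character, including newlines. The lazy .+? stops at the nearest AD rather than the last one. (?=AD) is a lookahead: it asserts that AD follows at that position without consuming it, so AD is not included in the matched string.

You can then call rstrip() on each resulting string to remove the trailing newlines.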
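
A minimal sketch of this approach end to end. The sample text and the helper name extract_paragraphs are made up for illustration and are not from the question:

```python
import re

def extract_paragraphs(text):
    # DOTALL lets '.' match newlines; '.+?' is lazy, so each match
    # stops just before the nearest 'AD'.
    matches = re.findall(r'AB.+?(?=AD)', text, re.DOTALL)
    # Drop the trailing newlines left in front of each 'AD'.
    return [m.rstrip() for m in matches]

# Hypothetical sample: two AB paragraphs, each followed by an AD paragraph.
sample = "AB first line\nsecond line\nAD other\nAB third\nAD end\n"
print(extract_paragraphs(sample))
```

Note that this cuts a paragraph short if the letters AD happen to appear inside it, so it only fits data where AD reliably marks the start of the next paragraph.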

Answer 2

From the Python re module documentation:

[] 
    Used to indicate a set of characters. Characters can be listed individually, 
    or a range of characters can be indicated by giving two characters and 
    separating them by a '-'. Special characters are not active inside sets.

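This means that a . inside brackets matches only a literal dot, not any character as it does elsewhere in a regular expression. The pattern [.\n]+ therefore matches only runs of dots and newlines, which is why the original expression finds nothing.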
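
A short check of this behaviour, using made-up input strings (the helper name question_pattern is hypothetical):

```python
import re

def question_pattern(text):
    # The pattern from the question: between AB and AD it accepts
    # only literal dots and newlines.
    return re.findall(r'AB[.\n]+AD', text)

# Ordinary text between AB and AD is not matched.
print(question_pattern("AB some text\nAD"))

# It matches only when nothing but dots/newlines sits in between.
print(question_pattern("AB..\nAD"))

# Inside a character class, '.' matches just the dot itself.
print(re.findall(r'[.]', "a.b"))
```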

Matching paragraphs that start with certain letters
