Question

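Why do I need to add the DOTALL flag to a Python regular expression so that it matches characters including the newline characters in a raw string? I ask because a raw string is supposed to ignore the escaping of special characters such as the newline. From the docs:

The solution is to use Python's raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline.

Here is my case: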

string = '\nSubject sentence is:  Appropriate support for families of children diagnosed with hearing impairment\nCausal Verb is :  may have\npredicate sentence is:  a direct impact on the success of early hearing detection and intervention programs in reducing the negative effects of permanent hearing loss'

re.search(r"Subject sentence is:(.*)Causal Verb is :(.*)predicate sentence is:(.*)", string ,re.DOTALL)

This produces a match. However, when I remove the DOTALL flag, I get no match.

Answer 1

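Your source string is not raw; only your pattern string is. The \n sequences in the source string are therefore real newline characters.

Maybe try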

string = r'\n...\n'
re.search("Subject sentence is:(.*)Causal Verb is :(.*)predicate sentence is:(.*)", string)
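To make the difference visible, here is a minimal sketch. The shortened `plain` and `raw` strings are made-up stand-ins for the string in the question, not the original data:

```python
import re

pattern = r"Subject sentence is:(.*)Causal Verb is :(.*)predicate sentence is:(.*)"

# Hypothetical shortened stand-ins for the string in the question.
plain = '\nSubject sentence is: A\nCausal Verb is : B\npredicate sentence is: C'
raw = r'\nSubject sentence is: A\nCausal Verb is : B\npredicate sentence is: C'

# In the normal literal, each \n is one real newline character.
has_newline_plain = '\n' in plain
# In the raw literal, each \n stays as a backslash followed by 'n'.
has_newline_raw = '\n' in raw

# Without DOTALL, '.' cannot cross the real newlines in `plain`.
match_plain = re.search(pattern, plain)
# `raw` contains no newline at all, so '.' is never blocked.
match_raw = re.search(pattern, raw)

print(has_newline_plain, has_newline_raw)
print(match_plain, match_raw is not None)
```

Note that making the source string raw changes the data itself: the captured groups then contain a literal backslash and 'n' instead of a line break. If the text really does contain newlines, as text read from a file usually does, the DOTALL flag is probably the better fit.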

Answer 2

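In a regular expression, . means any character except \n.

So if your string contains a newline, .* will not match past that newline (\n).

In Python, however, if you use the re.DOTALL flag (also known as re.S), the dot . matches \n (newline) as well.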
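A small sketch of that behaviour, using a made-up two-line sample string rather than the text from the question:

```python
import re

# Hypothetical sample containing one real newline.
text = "first line\nsecond line"

# Default behaviour: '.' stops at the newline, so there is no match.
without_flag = re.search(r"first(.*)second", text)

# With DOTALL, '.' also matches the newline.
with_flag = re.search(r"first(.*)second", text, re.DOTALL)

# The same flag can be written inline at the start of the pattern.
inline_flag = re.search(r"(?s)first(.*)second", text)

# Alternative without any flag: a class that matches every character.
char_class = re.search(r"first([\s\S]*)second", text)

print(without_flag)
print(repr(with_flag.group(1)))
```

Which spelling to use is mostly a matter of taste. The inline `(?s)` form can be handy when the pattern is passed around as a plain string and there is no place to supply flags.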
