Question

My text contains many occurrences of \n and |. Here is a sample:

this is a sample\n text. This symbol | shows what I am talking about.
This is \n another | sample

I want to extract everything between \n and |. For the sample above, that is: text. This symbol and another. How can I do this in Python 2.7?

Answer 1

Use a capturing group.

re.findall(r'\n([^|]*)\|', string)

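[^|]* matches any character other than |, zero or more times. When the pattern contains a capturing group, re.findall returns the text captured by that group rather than the whole match, so the result is the text in between. | is a special metacharacter in regular expressions that acts as the alternation operator; to match a literal | you must escape it as \|.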
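A minimal sketch of the capturing-group approach. The variable names and sample strings are assumptions for illustration, and print() is called with a single argument so the snippet runs on both Python 2.7 and Python 3. It also covers the case where the input holds a literal backslash followed by n instead of a real newline, which the question leaves ambiguous.

```python
import re

# Assumed sample input: "\n" here is a real newline character.
text = 'this is a sample\n text. This symbol | shows what I am talking about.'

# \n       a newline character
# ([^|]*)  capture everything up to the next pipe
# \|       a literal pipe (escaped, since a bare | means alternation)
matches = re.findall(r'\n([^|]*)\|', text)
print(matches)

# If the input instead contains a backslash followed by "n" (two characters),
# the backslash itself has to be matched, hence \\n in the pattern.
raw_text = r'this is a sample\n text. This symbol | shows'
raw_matches = re.findall(r'\\n([^|]*)\|', raw_text)
print(raw_matches)
```

Both calls return the captured text including its surrounding spaces; call strip() on each item if the padding is unwanted.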

Answer 2

You can use:

s='this is a sample\n text. This symbol | shows what I am talking about.\nThis is \n another | sample'

>>> print re.findall(r'\n([^|\n]*)\|', s)
[' text. This symbol ', ' another ']

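This regex matches a literal \n, followed by a negated character class:

([^|\n]*) matches zero or more characters that are neither a pipe nor a newline. The parentheses capture that text in a group, which is what findall returns. The pattern then ends by matching a literal |.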
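To see why excluding \n from the character class matters, the following sketch compares the two patterns on the same string. The names loose and tight are illustrative only.

```python
import re

s = ('this is a sample\n text. This symbol | shows what I am talking about.'
     '\nThis is \n another | sample')

# Without \n in the negated class, a capture can run across later newlines
# until it reaches the next pipe.
loose = re.findall(r'\n([^|]*)\|', s)

# With \n excluded, each capture starts at the newline closest to the pipe.
tight = re.findall(r'\n([^|\n]*)\|', s)

print(loose)
print(tight)
```

Here the second item of loose is 'This is \n another ', because the match begins at the earlier newline, whereas tight yields only ' another '.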

Alternatively, use lookarounds:

>>> print re.findall(r'(?<=\n )[^|\n]*(?= +\|)', s)
['text. This symbol', 'another']

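(?<=\n ) is a lookbehind: the match must be immediately preceded by a newline and a space.
(?= +\|) is a lookahead: the match must be immediately followed by one or more spaces and a pipe.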
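A runnable sketch of the lookaround variant, next to an alternative that keeps the capturing group and trims whitespace afterwards. The names trimmed and stripped are assumptions for illustration; the code runs on Python 2.7 and Python 3.

```python
import re

s = ('this is a sample\n text. This symbol | shows what I am talking about.'
     '\nThis is \n another | sample')

# Lookarounds check the surrounding context without consuming it, so the
# pattern has no capturing group and findall returns the whole match.
trimmed = re.findall(r'(?<=\n )[^|\n]*(?= +\|)', s)

# Alternative: capture with the earlier pattern, then strip each result.
stripped = [m.strip() for m in re.findall(r'\n([^|\n]*)\|', s)]

print(trimmed)
print(stripped)
```

The lookbehind expects exactly one space after the newline, so the strip() version may be the more forgiving choice when the amount of padding varies.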

Python regex to match the string between \n and |