Question

Raw data:

# Case 1
1980 (reprint 1987)

# Case 2
1980 (1987 reprint)

Captured groups:

{
    publish: 1980,
    reprint: 1987
}

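Requirements:

Match "reprint" so that the year next to it is treated as the reprint year.
The first matched year is always the publish year.

Current approach: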

# Match 2nd case but not the 1st case.
(?P<publish>\d{4}).*(?P<reprint>\d{4}(?=\sreprint.*))

# Match 1st case but not the 2nd case
(?P<publish>\d{4}).*(?<=reprint\s)(?P<reprint>\d{4})

I'm not sure how to combine the two regexes above, so I have to run the match twice. It would be better if both cases could be matched with a single regex.
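For reference, the two-pass approach described above might look like the following in Python. This is only a sketch: the helper name `parse_two_pass` and the conversion of the years to integers are assumptions made for illustration; the two patterns are the ones from the question.

```python
import re

# The two patterns from the question, tried one after the other.
REPRINT_AFTER_YEAR = re.compile(r'(?P<publish>\d{4}).*(?P<reprint>\d{4}(?=\sreprint.*))')
REPRINT_BEFORE_YEAR = re.compile(r'(?P<publish>\d{4}).*(?<=reprint\s)(?P<reprint>\d{4})')


def parse_two_pass(line):
    """Return the named groups as ints, or None if neither pattern matches."""
    for pattern in (REPRINT_AFTER_YEAR, REPRINT_BEFORE_YEAR):
        match = pattern.search(line)
        if match:
            return {name: int(year) for name, year in match.groupdict().items()}
    return None


for line in ("1980 (reprint 1987)", "1980 (1987 reprint)"):
    print(line, "->", parse_two_pass(line))
```

Both sample lines yield publish 1980 and reprint 1987, but each line may be scanned twice, which is the cost the question wants to avoid.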

Answer 1

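You can use this single regex with an alternation. The reprint group matches if the year is followed by \sreprint (asserted by a positive lookahead) or preceded by reprint\s (asserted by a positive lookbehind).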

(?P<publish>\d{4}).*?(?P<reprint>(?:\d{4}(?=\sreprint)|(?<=reprint\s)\d{4}))
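A minimal way to try this pattern, assuming Python's `re` module (the `(?P<name>...)` syntax suggests it, but the question does not say so explicitly). The helper name `parse_years` and the third sample line are made up for illustration.

```python
import re

# Alternation from the answer: a year followed by " reprint",
# or a year preceded by "reprint ".
COMBINED = re.compile(
    r'(?P<publish>\d{4}).*?'
    r'(?P<reprint>(?:\d{4}(?=\sreprint)|(?<=reprint\s)\d{4}))'
)


def parse_years(line):
    """Return the named groups as strings, or None when nothing matches."""
    match = COMBINED.search(line)
    return match.groupdict() if match else None


for line in ("1980 (reprint 1987)", "1980 (1987 reprint)", "1980 (1987)"):
    print(line, "->", parse_years(line))
```

The lookbehind works in Python because `reprint\s` has a fixed width. The last sample has no "reprint" keyword, so `parse_years` returns `None` for it.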


Answer 2

Maybe just:

(?P<publish>\d{4}).*(?:reprint )?(?P<reprint>\d{4})(?: reprint)?

https://regex101.com/r/lX7hK5/1

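This assumes that "reprint" may appear either before or after the year, whereas your raw data suggests it appears in only one of those places. It still works for your data, but it would also match, for example, 1980 (reprint 1987 reprint).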
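A quick check of what the optional groups imply, sketched in Python under the assumption that the pattern is used with `re.search`. The helper name `match_optional` and the last two sample strings are hypothetical.

```python
import re

# Pattern from this answer: both "reprint" parts are optional.
OPTIONAL_KEYWORD = re.compile(
    r'(?P<publish>\d{4}).*(?:reprint )?(?P<reprint>\d{4})(?: reprint)?'
)


def match_optional(line):
    """Return the named groups as strings, or None when nothing matches."""
    match = OPTIONAL_KEYWORD.search(line)
    return match.groupdict() if match else None


samples = [
    "1980 (reprint 1987)",
    "1980 (1987 reprint)",
    "1980 (reprint 1987 reprint)",  # keyword on both sides
    "1980 (1987)",                  # hypothetical input without the keyword
]
for line in samples:
    print(line, "->", match_optional(line))
```

In this sketch all four inputs produce the same groups, including the one without the keyword. Because both "reprint" parts are optional, the pattern appears to capture the rightmost four digits after the publish year rather than require the word itself, which may or may not be acceptable for the data at hand.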

Answer 3

You can simply specify the parentheses and merge your two regexes:

r'(?P<publish>\d{4})\s\(.*(?P<reprint>\d{4}).*\)'

Demo:

>>> [i.groupdict() for i in re.finditer(r'(?P<publish>\d{4})\s\(.*(?P<reprint>\d{4}).*\)', s)]
[{'reprint': '1987', 'publish': '1980'}, {'reprint': '1987', 'publish': '1980'}]

If reprint must be present inside the parentheses, you can enforce that with a positive lookahead.
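The lookahead variant itself is not shown here, so the following is only one possible way to write it, not necessarily the pattern the answer had in mind. It assumes the lookahead is placed right after the opening parenthesis; the name `WITH_KEYWORD` and the third input line are hypothetical.

```python
import re

# Assumed variant: (?=.*reprint) requires the word "reprint" somewhere
# after the opening parenthesis on the same line.
WITH_KEYWORD = re.compile(
    r'(?P<publish>\d{4})\s\((?=.*reprint).*(?P<reprint>\d{4}).*\)'
)

s = """1980 (reprint 1987)
1980 (1987 reprint)
1980 (1987)"""

# Without re.DOTALL, "." does not cross newlines, so each line is matched separately.
matches = [m.groupdict() for m in WITH_KEYWORD.finditer(s)]
print(matches)
```

Only the first two lines match; the third is skipped because it lacks the keyword. Note that `.*reprint` looks ahead to the end of the line, so in this sketch a "reprint" after the closing parenthesis would also satisfy it.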

How do I match two patterns with one regex?
