Question

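I'm trying to make two mutually exclusive regex patterns for use in Java.

Regex one:

Match anything between [[ and ]] except the | (pipe) character or \r\n newline characters, capturing the full match and only the text between [[ and ]] into groups 1 and 2 respectively:

Regex: (\[\[([^|\r\n]*?)]])

It should return 2 full matches for the following input:

[[wikipage&sfd/weird-]] [[The whitespace.con_vention/+-test]]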
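
For reference, here is a minimal Java sketch of how the pattern (\[\[([^|\r\n]*?)]]) might be exercised. The class name RegexOneDemo and the helper groups are made up for illustration; only java.util.regex from the standard library is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexOneDemo {
    // The pipe-free pattern, written as a Java string literal (backslashes doubled).
    static final Pattern REGEX_ONE = Pattern.compile("(\\[\\[([^|\\r\\n]*?)]])");

    // Hypothetical helper: collects the requested capture group of every match.
    static List<String> groups(String input, int group) {
        List<String> out = new ArrayList<>();
        Matcher m = REGEX_ONE.matcher(input);
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "[[wikipage&sfd/weird-]] [[The whitespace.con_vention/+-test]]";
        // Group 1 is the full [[...]] match, group 2 the text between the brackets.
        for (String full : groups(input, 1)) {
            System.out.println("full:  " + full);
        }
        for (String inner : groups(input, 2)) {
            System.out.println("inner: " + inner);
        }
    }
}
```

Because the character class excludes |, a link such as [[SandBox|the sandbox]] produces no match with this pattern.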

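Regex two:

Match everything between [[ and ]] except the | (pipe) character or \r\n newline characters, capturing the full match and only the text before the | into groups 1 and 2 respectively:

Regex: (\[\[([^|\r\n]*?)\|.*?]])

It should return 2 full matches for the following input:

[[SandBox|the sandbox]] [[SandBox|the.sandbox_/=test]]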

But with the following input:

Test [[wiki:1]], [[wiki:page]test|test ]one]], [[wiki:1|page one]]

the first full match of regex two is:

[[wiki:1]], [[wiki:page]test|test ]one]]

and the second full match of regex two is:

[[wiki:1|page one]]

whereas I expected exactly these 2 full matches:

[[wiki:page]test|test ]one]]

[[wiki:1|page one]]
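
The behaviour can be reproduced with a short Java sketch along these lines. The class name RegexTwoProblemDemo and the helper groups are invented for illustration; the pattern is (\[\[([^|\r\n]*?)\|.*?]]) written as a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTwoProblemDemo {
    // The pipe pattern, without any guard against "]]" appearing before the pipe.
    static final Pattern REGEX_TWO =
            Pattern.compile("(\\[\\[([^|\\r\\n]*?)\\|.*?]])");

    // Hypothetical helper: collects the requested capture group of every match.
    static List<String> groups(String input, int group) {
        List<String> out = new ArrayList<>();
        Matcher m = REGEX_TWO.matcher(input);
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String input =
                "Test [[wiki:1]], [[wiki:page]test|test ]one]], [[wiki:1|page one]]";
        // The first match starts at [[wiki:1]] because [^|\r\n]*? is free to
        // run across the closing "]]" until it reaches the first pipe.
        for (String full : groups(input, 1)) {
            System.out.println(full);
        }
    }
}
```

The lazy quantifier only makes the engine try shorter runs first; it does not stop the character class from consuming ]] when that is the only way to reach a |.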

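I tried a negative lookahead like this: (\[\[([^|\r\n]*?)(?!]])\|.*?]]) but if it finds ]], it just backtracks and reuses the first part of the regex.

So my question is: how can I skip/cancel the entire regex if ]] is found before |?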

Answer 1

If you don't want to allow ]] before the first |, you can use a negative lookahead assertion for this:

\[\[((?:(?!]])[^|\r\n])*)\|.*?]]

Explanation

\[\[        # Match [[
(           # Match and capture in group 1...
 (?:        # (Start of non-capturing group)
  (?!]])    # ... (unless the text "]]" is directly ahead)
  [^|\r\n]  # ... any character except pipes or newlines
 )*         # repeat as necessary (lazy evaluation is not needed here)
)           # End of capturing group
\|          # Match a pipe
.*?         # and any number of characters until
]]          # the next instance of ]]

See it live on regex101.com.
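
Since the question targets Java, here is a minimal sketch of the pattern \[\[((?:(?!]])[^|\r\n])*)\|.*?]] as a Java string literal. The class name LookaheadFixDemo and the helper groups are made up for illustration. Note that this pattern has no outer capturing group, so the full match is group 0 and the text before the pipe is group 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadFixDemo {
    // Each character before the pipe is consumed only if "]]" does not
    // start at that position, so the match cannot run past a closing "]]".
    static final Pattern FIXED =
            Pattern.compile("\\[\\[((?:(?!]])[^|\\r\\n])*)\\|.*?]]");

    // Hypothetical helper: collects the requested group of every match
    // (0 = full match, 1 = text before the pipe).
    static List<String> groups(String input, int group) {
        List<String> out = new ArrayList<>();
        Matcher m = FIXED.matcher(input);
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String input =
                "Test [[wiki:1]], [[wiki:page]test|test ]one]], [[wiki:1|page one]]";
        for (String full : groups(input, 0)) {
            System.out.println(full);
        }
        // Prints:
        // [[wiki:page]test|test ]one]]
        // [[wiki:1|page one]]
    }
}
```

A single ] before the pipe is still accepted, because the lookahead only rejects two consecutive closing brackets.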
