Question

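I want to set up a pattern that finds a capture group ending at the first occurrence of the "boundary". Right now it uses the last boundary instead.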

E.g.:

String text = "this should match from A to the first B and not 2nd B, got that?";
Pattern ptrn = Pattern.compile("\\b(A.*B)\\b");
Matcher mtchr = ptrn.matcher(text);
while(mtchr.find()) {
    String match = mtchr.group();
    System.out.println("Match = <" + match + ">");
}

This prints:

"Match = <A to the first B and not 2nd B>"

I would like it to print:

"Match = <A to the first B>"

What do I need to change in the pattern?

Answer 1

Make your * non-greedy / reluctant by using *?:

Pattern ptrn = Pattern.compile("\\b(A.*?B)\\b");

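By default, a quantifier is greedy: it matches as many characters as possible while still satisfying the pattern, which here means everything up to the last B.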
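To make the difference concrete, here is a minimal sketch that runs both the greedy and the reluctant pattern against the text from the question. The class name GreedyVsReluctant and the helper firstMatch are made up for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answers.
class GreedyVsReluctant {
    // Returns the first match of regex in text, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "this should match from A to the first B and not 2nd B, got that?";
        // Greedy .* runs to the end of the input, then backtracks to the last B.
        System.out.println("greedy:    <" + firstMatch("\\b(A.*B)\\b", text) + ">");
        // Reluctant .*? grows one character at a time and stops at the first B
        // that is followed by a word boundary.
        System.out.println("reluctant: <" + firstMatch("\\b(A.*?B)\\b", text) + ">");
    }
}
```

On the question's input the greedy version yields "A to the first B and not 2nd B" and the reluctant one yields "A to the first B".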

See the reluctant quantifiers in the docs and this tutorial.

Answer 2

Don't use a greedy expression for the match, i.e.:

Pattern ptrn = Pattern.compile("\\b(A.*?B)\\b");

Answer 3

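* is a greedy quantifier: it matches as many characters as possible while still satisfying the pattern, which in your example means up to the last occurrence of B. That's why you need the reluctant version, *?, which matches as few characters as possible. So your pattern needs only a small change: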

Pattern ptrn = Pattern.compile("\\b(A.*?B)\\b");

See "reluctant quantifiers" in the docs and this tutorial.

Answer 4

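Perhaps more explicit than making * reluctant/lazy is to say that you're looking for A, followed by a bunch of stuff that isn't B, followed by B: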

Pattern ptrn = Pattern.compile("\\b(A[^B]*B)\\b");
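The two approaches agree on the question's input, but they are not equivalent in general. The sketch below compares them on a second, made-up input, "A xBy B", where the first B sits inside a word. The class name NegatedClassVsReluctant and the helper firstMatch are hypothetical names for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the negated class with the reluctant quantifier.
class NegatedClassVsReluctant {
    // Returns the first match of regex in text, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String question = "this should match from A to the first B and not 2nd B, got that?";
        // On the question's text both patterns stop at the first B.
        System.out.println(firstMatch("\\b(A[^B]*B)\\b", question));
        System.out.println(firstMatch("\\b(A.*?B)\\b", question));

        // Made-up input: the first B is followed by a word character, so \b fails there.
        String tricky = "A xBy B";
        // [^B]* can never step over a B, so the negated-class pattern finds no match.
        System.out.println(firstMatch("\\b(A[^B]*B)\\b", tricky));
        // .*? is allowed to keep expanding past that B until \b succeeds.
        System.out.println(firstMatch("\\b(A.*?B)\\b", tricky));
    }
}
```

So the negated class guarantees that the match never contains an inner B, while the reluctant quantifier only prefers the shortest match and will extend past a B if the rest of the pattern cannot succeed there.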

Java regex pattern to match the first occurrence of a "boundary" after any character sequence
