Question

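I searched for questions about Java regular expressions and found information on the Pattern and Matcher classes, which give you the groups of text surrounding the part that matches the regex.

My requirement is different, however. I want to extract the actual text that the regular expression itself stands for.

Example: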

Input text: ABC 22. XYZ
Regular expression: (.*) [0-9]* (.*)

Using the Pattern and Matcher classes (or any other approach in Java), how can I get the text "22."? That is the text the regular expression represents.

Answer 1

You can try the following regex¹:

.*?(\s*\d+\.\s+).*

With a graphical tool², you can see where the capturing group sits inside the regex.

To extract that group in Java, do the following:

String input = "ABC 22. XYZ";

System.out.println(
    input.replaceAll(".*?(\\s*\\d+\\.\\s+).*", "$1")
);  // prints " 22. "

In the replacement string, $1 stands for the text captured by group #1.
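The same group can also be read with Pattern and Matcher, which is what the question asked about. The following is only a sketch: the class name ExtractWithMatcher and the helper extract are made up for illustration, and returning null for non-matching input is an assumption, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
public class ExtractWithMatcher {
    // Same regex as above, compiled once so it can be reused.
    private static final Pattern NUMBER_WITH_DOT =
            Pattern.compile(".*?(\\s*\\d+\\.\\s+).*");

    // Returns the text captured by group 1, or null if the whole input does not match.
    public static String extract(String input) {
        Matcher m = NUMBER_WITH_DOT.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String captured = extract("ABC 22. XYZ");
        System.out.println("[" + captured + "]");        // prints "[ 22. ]"
        System.out.println("[" + captured.trim() + "]"); // prints "[22.]"
    }
}
```

Because the group includes the surrounding whitespace, trim() is needed to get exactly "22.".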

Notes

¹ Explanation of the regex:

NODE         EXPLANATION
------------------------------------------------------------------
  .*?        any character except \n (0 or more times
             (matching the least amount possible))
------------------------------------------------------------------
  (          group and capture to \1:
------------------------------------------------------------------
    \s*        whitespace (\n, \r, \t, \f, and " ") (0
               or more times (matching the most amount
               possible))
------------------------------------------------------------------
    \d+        digits (0-9) (1 or more times (matching
               the most amount possible))
------------------------------------------------------------------
    \.         '.'
------------------------------------------------------------------
    \s+        whitespace (\n, \r, \t, \f, and " ") (1
               or more times (matching the most amount
               possible))
------------------------------------------------------------------
  )          end of \1
------------------------------------------------------------------
  .*         any character except \n (0 or more times
             (matching the most amount possible))

² The tool used for the screenshot is Regexper.

Answer 2

Your capturing groups are off.

Pattern p = Pattern.compile("(\\d+\\.?)");
Matcher m = p.matcher("ABC 22. XYZ");
if (m.find()) {
  // group 1 is the text matched inside the parentheses
  System.out.println(m.group(1));  // prints "22."
}

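Use ( and ) to define a capturing group, which you can later retrieve from the matcher by its group index. Group 0 is always the entire match.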
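To make the difference between group 0 and group 1 visible, here is a small sketch that uses a slightly different pattern, (\d+)\. , in which the dot sits outside the parentheses. The class name GroupIndexDemo and the helper groups are hypothetical and exist only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this example only.
public class GroupIndexDemo {
    // Digits are captured in group 1; the trailing dot is matched but not captured.
    private static final Pattern DIGITS_THEN_DOT = Pattern.compile("(\\d+)\\.");

    // Returns {group 0, group 1} for the first match, or an empty array if nothing matches.
    public static String[] groups(String input) {
        Matcher m = DIGITS_THEN_DOT.matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] g = groups("ABC 22. XYZ");
        System.out.println("group 0: " + g[0]); // prints "group 0: 22."
        System.out.println("group 1: " + g[1]); // prints "group 1: 22"
    }
}
```

Calling m.group() without an argument is equivalent to m.group(0).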

Answer 3

Your input has a dot after "22", but your regex doesn't account for it.
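A quick way to check this is to run the regex from the question against the sample input. In the sketch below, the adjusted pattern with an optional dot and an extra capturing group is only one possible fix and is an assumption of this example, as are the class name DotCheck and its helper methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this example only.
public class DotCheck {
    // The regex exactly as written in the question.
    static final Pattern ORIGINAL = Pattern.compile("(.*) [0-9]* (.*)");
    // Assumed fix: allow an optional dot and capture the middle part as group 2.
    static final Pattern ADJUSTED = Pattern.compile("(.*) ([0-9]*\\.?) (.*)");

    public static boolean originalMatches(String input) {
        return ORIGINAL.matcher(input).matches();
    }

    // Returns the middle part captured by the adjusted pattern, or null if there is no match.
    public static String middle(String input) {
        Matcher m = ADJUSTED.matcher(input);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "ABC 22. XYZ";
        System.out.println(originalMatches(input)); // prints "false"
        System.out.println(middle(input));          // prints "22."
    }
}
```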

If your input contains only one number, you can extract it like this:

String number = input.replaceAll(".*?(\\d+).*", "$1");

This regex matches the (first) number, of any length, anywhere in the input, regardless of what the rest of the input looks like.
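One thing to keep in mind with the replaceAll approach: when the regex does not match at all, replaceAll returns the input unchanged, so the result cannot be told apart from a successful extraction. The sketch below contrasts it with Matcher.find(); the class name FirstNumber and both helper methods are hypothetical, and returning null when no number is found is a choice made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this example only.
public class FirstNumber {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // The one-liner from the answer: replaces the whole input with the first number.
    public static String viaReplaceAll(String input) {
        return input.replaceAll(".*?(\\d+).*", "$1");
    }

    // Alternative: returns the first run of digits, or null if there is none.
    public static String viaFind(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(viaReplaceAll("ABC 22. XYZ")); // prints "22"
        System.out.println(viaReplaceAll("no digits"));   // prints "no digits"
        System.out.println(viaFind("ABC 22. XYZ"));       // prints "22"
        System.out.println(viaFind("no digits"));         // prints "null"
    }
}
```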

Extract matched string using reg-ex
