How to use nested non-capturing groups in a Java regular expression

Date: 2016-11-06 21:31:24

Tags: java regex

I have a series of lines like the following (they can appear in any order):

Distal latency   4.9 N/A N/A 4.0 N/A N/A N/A N/A 6.3 4.4 N/A

 % failed Chicago Classification  70 1 1 0 1 1 1 1 0 0 1

 % panesophageal pressurization  0 0 0 0 0 0 0 0 0 0 0

 % premature contraction  20 0 0 1 0 0 0 0 0 1 0

 % rapid contraction  10 0 0 1 0 0 0 0 0 0 0

 % large breaks  10 0 0 0 0 0 0 0 1 0 0

 % small breaks  10 0 0 1 0 0 0 0 0 0 0

Ultimately I want to extract the row title and each value into a Hash, like this:

Distallatency=4.9,Distallatency=N/A etc.
failedChicagoClassification1=70,failedChicagoClassification1=1,failedChicagoClassification1=1,failedChicagoClassification1=0,failedChicagoClassification1=1 etc.

and so on

My strategy for doing this is:

1. Join the words of the title together by replacing the \s between words
2. End the joined word with a character, e.g. ":", and replace the remaining \s with it, so I can then split each line into an array on that character
3. Loop through the array, adding the line title combined with each value into a Hash

This is what I have so far:

Pattern match_patternSwallow2 = Pattern.compile("(?:.*\\d+\\.\\d|N\\/A|\\d*){4,50}");
Matcher matchermatch_patternSwallow2 = match_patternSwallow2.matcher(s);

while (matchermatch_patternSwallow2.find()){
    String found = matchermatch_patternSwallow2.group(0).trim();
    System.out.println(found);

    //Join up the words so can then split by space
    found = found.replaceAll("([A-Za-z]+)\\s", "$1_").replaceAll("\\s", ":");
    List<String> myList = new ArrayList<String>(Arrays.asList(found.split(":")));

    for (int ff=1;ff<myList.size();ff++){
        mapSwallow.put(myList.get(0)+"MapSwallowsNum"+ff,myList.get(ff));
    }
}

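This runs without errors, but the System.out line only prints empty strings.

What am I doing wrong?

1 answer:

Answer 0 (score: 1)

I can suggest the following regex to match each qualifying line: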

"(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$"

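See the regex demo.

Details:

  • (?m) - multiline mode on
  • ^ - start of a line
  • \\W* - 0+ non-word characters
  • ([a-zA-Z].*?) - (Group 1) a letter followed by any 0+ characters other than line breaks, as few as possible
  • \\s* - zero or more whitespace characters
  • ((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*) - Group 2, capturing 0+ occurrences of either a digit sequence (optionally followed by a dot and more digits) or N/A, each followed by 0+ whitespace characters
  • $ - end of a line.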
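
To make the two capturing groups concrete, here is a minimal sketch that applies the pattern to one of the sample rows. The class name GroupDemo and the helper groups are made up for illustration; only Pattern and Matcher from java.util.regex are used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Same pattern as in the answer; the nested (?:...) groups only group,
    // so the numbered groups stay 1 (title) and 2 (all values).
    static final Pattern ROW = Pattern.compile(
        "(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$");

    // Hypothetical helper: returns {group 1, group 2} for the first match,
    // or null if nothing in the input matches.
    static String[] groups(String line) {
        Matcher m = ROW.matcher(line);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] g = groups(" % large breaks  10 0 0 0 0 0 0 0 1 0 0");
        System.out.println("title  = [" + g[0] + "]"); // title  = [large breaks]
        System.out.println("values = [" + g[1] + "]"); // values = [10 0 0 0 0 0 0 0 1 0 0]
    }
}
```

The leading " % " is consumed by \\W* and so never reaches Group 1, and the lazy .*? stops at the first point where everything left on the line is made up of values.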

Once a match is found, use .group(1).replaceAll("\\s+","") as the key, then split .group(2) with .split("\\s+") to get the values.

See the sample code:

import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "Distal latency   4.9 N/A N/A 4.0 N/A N/A N/A N/A 6.3 4.4 N/A\n\n % failed Chicago Classification  70 1 1 0 1 1 1 1 0 0 1\n\n % panesophageal pressurization  0 0 0 0 0 0 0 0 0 0 0\n\n % premature contraction  20 0 0 1 0 0 0 0 0 1 0\n\n % rapid contraction  10 0 0 1 0 0 0 0 0 0 0\n\n % large breaks  10 0 0 0 0 0 0 0 1 0 0\n\n % small breaks  10 0 0 1 0 0 0 0 0 0 0";
Pattern match_patternSwallow2= Pattern.compile("(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$");
Matcher matchermatch_patternSwallow2 = match_patternSwallow2.matcher(s);
HashMap<String, String> mapSwallow = new HashMap<String, String>();
while (matchermatch_patternSwallow2.find()){
    // Group 2 holds all values of the row; split them on whitespace
    String[] myList = matchermatch_patternSwallow2.group(2).split("\\s+");
    // Group 1 is the row title; remove its whitespace to build the key prefix
    String p1 = matchermatch_patternSwallow2.group(1).replaceAll("\\s+", "");
    // Append a 1-based counter so every value gets its own key
    int line = 1;
    for (String p2s: myList){
        mapSwallow.put(p1+line, p2s);
        line++;
    }
}
System.out.println(mapSwallow);
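
The desired output in the question repeats the same key for every value, which a HashMap cannot hold, so the code above appends a counter to each key. As an alternative sketch (not part of the original answer), the values of each row can be kept together in a list keyed by the title. RowParser and parse are hypothetical names, and a LinkedHashMap is assumed only so that rows keep their input order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RowParser {
    static final Pattern ROW = Pattern.compile(
        "(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$");

    // Maps each row title (whitespace removed) to its values in column order.
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        Matcher m = ROW.matcher(text);
        while (m.find()) {
            // Group 2 may end with a line break, so trim before splitting.
            String values = m.group(2).trim();
            if (values.isEmpty()) {
                continue; // a title with no values: skip it
            }
            String key = m.group(1).replaceAll("\\s+", "");
            rows.put(key, Arrays.asList(values.split("\\s+")));
        }
        return rows;
    }

    public static void main(String[] args) {
        String s = "Distal latency   4.9 N/A N/A 4.0\n\n % large breaks  10 0 0 1";
        System.out.println(parse(s));
        // {Distallatency=[4.9, N/A, N/A, 4.0], largebreaks=[10, 0, 0, 1]}
    }
}
```

If two rows share the same title, the later row replaces the earlier one in this sketch.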