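Java regex to recognize references to bugs in GitHub

Date: 2016-12-28 23:39:24

Tags: java regex github

I need to capture all the bug numbers referenced by commit messages in GitHub.

A bug number is an integer, and a reference starts with fix / fixes / fixed / fixing / close / closes / closed / closing / resolve / resolves / resolved / resolving, followed by #XYZ, where XYZ is the bug number.

Here is an example of what I have tried: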

String commitMessage = "This fixes #23 fixed#24 fix #25, #26 resolves #27 #28#29 resolved#30 #31 ,  #32. Also see #33";
String regex = "clos(e|es|ed|ing) ?#[0-9]+" 
        + "|fix(es|ed|ing)? ?#[0-9]+" 
        + "|resolv(e|es|ed|ing) ?#[0-9]+";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(commitMessage);
while (m.find()){
    System.out.println(m.group(0));
}

The output is:

fixes #23
fixed#24
fix #25
resolves #27
resolved#30

But I need it to be:

fixes #23
fixed#24
fix #25, #26
resolves #27 #28#29
resolved#30 #31 ,  #32

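Note that a reference can point to a single bug (e.g., #23) or to several bugs at once (e.g., #25, #26).

Also note that when several bugs are referenced, there may be one or more spaces and/or commas between the bug numbers.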

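3 Answers:

Answer 0: (score: 2)

You can add [\s\p{P}]* before the # in the regex to match zero or more whitespace or punctuation characters, and you can shrink the pattern a little: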

String regex = "(?:(?:clos|resolv)(?:e|es|ed|ing)|fix(?:es|ed|ing)?)(?:[\\s\\p{P}]*#[0-9]+)+";

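The main difference is (?:[\\s\\p{P}]*#[0-9]+)+, which matches one or more occurrences of:

  • [\\s\\p{P}]* - zero or more whitespace or punctuation characters
  • # - a hash sign
  • [0-9]+ - one or more digits.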
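One consequence of \p{P} that may be worth seeing in action: it accepts any punctuation between bug numbers, not only commas, so a full stop between two references does not end the match. The sketch below compares it with a stricter separator class. The class name SeparatorDemo, the helper firstMatch, the STRICT variant, and the sample message are all made up for illustration and are not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeparatorDemo {
    // Keyword part shared by both variants (same alternation as in the answer).
    static final String KEYWORD =
            "(?:(?:clos|resolv)(?:e|es|ed|ing)|fix(?:es|ed|ing)?)";

    // Variant from the answer: any whitespace or punctuation may separate numbers.
    static final Pattern LOOSE =
            Pattern.compile(KEYWORD + "(?:[\\s\\p{P}]*#[0-9]+)+");

    // Stricter variant (an assumption, not from the answer): whitespace and commas only.
    static final Pattern STRICT =
            Pattern.compile(KEYWORD + "(?:[\\s,]*#[0-9]+)+");

    // Returns the first match of the pattern in the text, or null if there is none.
    static String firstMatch(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String message = "fix #1. #2 is unrelated"; // hypothetical commit message
        System.out.println(firstMatch(LOOSE, message));  // fix #1. #2
        System.out.println(firstMatch(STRICT, message)); // fix #1
    }
}
```

Whether the looser or the stricter behavior is preferable depends on how the commit messages are written; for the sample message in the question both variants give the same matches.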

See the Java demo:

String commitMessage = "This fixes #23 fixed#24 fix #25, #26 resolves #27 #28#29 resolved#30 #31 ,  #32. Also see #33";
String regex = "(?:(?:clos|resolv)(?:e|es|ed|ing)|fix(?:es|ed|ing)?)(?:[\\s\\p{P}]*#[0-9]+)+";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(commitMessage);
while (m.find()){
    System.out.println(m.group(0));
}

Output:

fixes #23
fixed#24
fix #25, #26
resolves #27 #28#29
resolved#30 #31 ,  #32
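If the goal is the bug numbers themselves rather than the matched text, a second pass over each match can pull them out as integers. This is only a sketch: the class name BugNumbers and the method referencedBugs are hypothetical, and it assumes every bug number fits in an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BugNumbers {
    // Same pattern as in the answer: a keyword followed by one or more #N references.
    static final Pattern REFERENCE = Pattern.compile(
            "(?:(?:clos|resolv)(?:e|es|ed|ing)|fix(?:es|ed|ing)?)(?:[\\s\\p{P}]*#[0-9]+)+");

    // Picks the digits out of a single "#N" token.
    static final Pattern NUMBER = Pattern.compile("#([0-9]+)");

    // Returns every bug number attached to a fix/close/resolve keyword, in order.
    static List<Integer> referencedBugs(String commitMessage) {
        List<Integer> bugs = new ArrayList<>();
        Matcher ref = REFERENCE.matcher(commitMessage);
        while (ref.find()) {
            // Second pass: scan only the text of this reference for "#N" tokens.
            Matcher num = NUMBER.matcher(ref.group());
            while (num.find()) {
                bugs.add(Integer.parseInt(num.group(1))); // assumes the number fits in an int
            }
        }
        return bugs;
    }

    public static void main(String[] args) {
        String commitMessage = "This fixes #23 fixed#24 fix #25, #26 resolves #27 #28#29 "
                + "resolved#30 #31 ,  #32. Also see #33";
        System.out.println(referencedBugs(commitMessage));
        // [23, 24, 25, 26, 27, 28, 29, 30, 31, 32]
    }
}
```

Note that #33 is left out, because "Also see" is not one of the keywords.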

Answer 1: (score: 1)

You can use the following regex:

clos(e|es|ed|ing)([ ,]*#[0-9]+)+ ?|fix(es|ed|ing)?([ ,]*#[0-9]+)+ ?|resolv(e|es|ed|ing)([ ,]*#[0-9]+)+ ?

Here is a working example:
https://regex101.com/r/In7cox/1
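To try this pattern from Java, it can be pasted into a string literal unchanged, since it contains no backslashes. A minimal sketch, with a hypothetical class name CommaSpaceRegexDemo and helper references; each match is trimmed because the trailing " ?" in the pattern can pick up one space after the last number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaSpaceRegexDemo {
    // The regex from this answer, split across lines for readability only.
    static final Pattern REFERENCE = Pattern.compile(
            "clos(e|es|ed|ing)([ ,]*#[0-9]+)+ ?"
            + "|fix(es|ed|ing)?([ ,]*#[0-9]+)+ ?"
            + "|resolv(e|es|ed|ing)([ ,]*#[0-9]+)+ ?");

    // Returns each keyword-plus-numbers reference, without the optional trailing space.
    static List<String> references(String commitMessage) {
        List<String> found = new ArrayList<>();
        Matcher m = REFERENCE.matcher(commitMessage);
        while (m.find()) {
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        String commitMessage = "This fixes #23 fixed#24 fix #25, #26 resolves #27 #28#29 "
                + "resolved#30 #31 ,  #32. Also see #33";
        for (String reference : references(commitMessage)) {
            System.out.println(reference);
        }
    }
}
```

Only spaces and commas are accepted between numbers here, so a tab or a line break between two references would end the match.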

Answer 2: (score: 1)

I would use two regexes (and two while loops). I would also use named capture groups to make the code more readable and easier to maintain:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GitHubBugTrackingRegex {

    public static void main(String[] args) {

        String commitMessage = "This fixes #23 fixed#24 fix #25, #26 "
                + "resolves #27 #28#29 resolved#30 #31 ,  #32. Also see #33";
        String regexBugReference    = "(?<oneBug>#\\d+)"; 
        String regexBugReferences   = "(?<someBugs>(\\s*,*\\s*" + regexBugReference + "\\s*)+)"; 
        String regex = 
                "(?<oneCase>(?<resolution>clos(e|es|ed|ing)|fix(|es|ed|ing)|resolv(e|es|ed|ing))"   
                        + regexBugReferences
                        + ")";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(commitMessage);
        while (m.find()){
            String resolution   = m.group("resolution");
            String someBugs     = m.group("someBugs");
            Pattern p2 = Pattern.compile(regexBugReference);
            Matcher m2 = p2.matcher(someBugs);
            StringBuilder sb = new StringBuilder();
            String comma = "";      // first time special
            while (m2.find()) {
                String oneBug = m2.group("oneBug");
                sb.append(comma + oneBug);
                comma = ", ";       // second time and onwards
            }
            System.out.format("%8s %s%n", resolution, sb.toString());
        }

    }

}

The output of this code is:

   fixes #23
   fixed #24
     fix #25, #26
resolves #27, #28, #29
resolved #30, #31, #32