Error when using a compiled regular expression in Java

Date: 2015-04-09 14:36:56

Tags: java regex

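I use regular expressions heavily to store business parameters in various tables (mostly business decision-tree logic). When thousands of business objects try to match themselves against these regex-driven parameters, calling String.matches() on their various properties can be very slow, because String.matches() recompiles the pattern on every call. So I created a class named MatchRegex that serves as the property type in place of a regex String. Internally it compiles the regular expression once and resets a matcher with each input string to test.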

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class MatchRegex {
    private final String regex;
    private final Pattern pattern;
    // A single Matcher instance, created once and reused for every call to matches()
    private final Matcher matcher;

    private MatchRegex(String regex) {
        this.regex = regex;
        this.pattern = Pattern.compile(regex);
        // Placeholder input; the matcher is reset with the real input on each call
        this.matcher = pattern.matcher("Hello");
    }
    public static MatchRegex of(String regex) {
        return new MatchRegex(regex);
    }
    public boolean matches(String input) {
        return matcher.reset(input).matches();
    }
    public String getRegex() {
        return regex;
    }
}

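However, I'm a bit uneasy: I randomly get an error that makes little sense to me unless I dig into the Pattern source code. It fails on the line return matcher.reset(input).matches(). Is this a bug in the regex library? How can I fix it?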

java.lang.StringIndexOutOfBoundsException: String index out of range: 7
    at java.lang.String.charAt(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$BranchConn.match(Unknown Source)
    at java.util.regex.Pattern$Slice.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)
    at java.util.regex.Pattern$GroupHead.match(Unknown Source)
    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$BranchConn.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)
    at java.util.regex.Pattern$GroupHead.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Matcher.match(Unknown Source)
    at java.util.regex.Matcher.matches(Unknown Source)

1 Answer:

Answer 0 (score: 1)

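I realized, at the same time as Jon Skeet, that Matcher is not thread-safe. When several threads call matches() on the same MatchRegex, one thread can reset the shared Matcher to a shorter input while another thread is still matching, which explains the StringIndexOutOfBoundsException. I will need to use thread-local instances or synchronization. Hopefully the performance cost won't be too high.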
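For reference, the thread-local option mentioned above could look roughly like the sketch below. The class name ThreadLocalMatchRegex is hypothetical, and it assumes Java 8 or later for ThreadLocal.withInitial. It keeps the reuse-and-reset idea, but gives each thread its own Matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of MatchRegex: one Matcher per thread instead of one shared Matcher.
final class ThreadLocalMatchRegex {
    private final String regex;
    private final ThreadLocal<Matcher> matcher;

    private ThreadLocalMatchRegex(String regex) {
        this.regex = regex;
        final Pattern pattern = Pattern.compile(regex);
        // Each thread lazily creates its own Matcher the first time it calls matches().
        this.matcher = ThreadLocal.withInitial(() -> pattern.matcher(""));
    }

    static ThreadLocalMatchRegex of(String regex) {
        return new ThreadLocalMatchRegex(regex);
    }

    boolean matches(String input) {
        // reset() only touches the calling thread's Matcher, so threads cannot interfere.
        return matcher.get().reset(input).matches();
    }

    String getRegex() {
        return regex;
    }
}
```

This avoids allocating a Matcher per call, at the cost of keeping one Matcher alive per thread per regex, which may matter when there are many patterns and long-lived pool threads.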

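Update: I think the most efficient strategy is simply to create a new Matcher on every call. The compiled Pattern is immutable and safe to share between threads, so only the Matcher needs to be created per call.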

import java.util.regex.Pattern;

public final class MatchRegex {
    private final String regex;
    // Compiled once; Pattern is immutable and can be shared between threads
    private final Pattern pattern;

    private MatchRegex(String regex) {
        this.regex = regex;
        this.pattern = Pattern.compile(regex);
    }
    public static MatchRegex of(String regex) {
        return new MatchRegex(regex);
    }
    public boolean matches(String input) {
        // A fresh Matcher per call, so no mutable state is shared between threads
        return pattern.matcher(input).matches();
    }
    public String getRegex() {
        return regex;
    }
}
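As a rough sanity check of this approach, the sketch below shares one compiled Pattern between several threads and counts results that differ from the expected ones. The class SharedPatternCheck, its method, and its parameters are all hypothetical names made up for illustration. It is not a benchmark and says nothing about performance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Pattern;

// Hypothetical helper: hammers one shared Pattern from several threads.
final class SharedPatternCheck {
    // Returns how many match results disagreed with expected[] across all threads.
    static int countMismatches(String regex, String[] inputs, boolean[] expected,
                               int threads, int rounds) {
        final Pattern pattern = Pattern.compile(regex);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Integer> task = () -> {
                    int bad = 0;
                    for (int r = 0; r < rounds; r++) {
                        for (int i = 0; i < inputs.length; i++) {
                            // New Matcher per call, as in the updated MatchRegex
                            if (pattern.matcher(inputs[i]).matches() != expected[i]) {
                                bad++;
                            }
                        }
                    }
                    return bad;
                };
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a Matcher created per call, the count should stay at zero no matter how many threads run, since the only shared object is the immutable Pattern.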