Here is a utility method for checking whether a replacement string is valid:
    public static boolean isValidReplacementString(String regex, String replacement) {
        try {
            "".replaceFirst(regex, replacement);
            return true;
        } catch (IllegalArgumentException | NullPointerException e) {
            return false;
        }
    }
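I want to run this check before performing the actual replacement, because obtaining the source string is expensive (I/O).
I find this solution very hacky. Is there a method in the standard library that I have missed?
Edit: As pointed out by sln, this does not even work when no match is found.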
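To make the no-match problem concrete, here is a small sketch. The class `NoMatchDemo` and the helper `completes` are hypothetical names used only for illustration. It relies on the fact that `replaceFirst` returns the input unchanged when the pattern does not match, so the replacement string is never parsed in that case.

```java
class NoMatchDemo {
    // Hypothetical helper: returns true if replaceFirst completes without throwing.
    static boolean completes(String input, String regex, String replacement) {
        try {
            input.replaceFirst(regex, replacement);
            return true;
        } catch (IllegalArgumentException | IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "a" does not match the empty string, so the bad reference "$5" goes unnoticed.
        System.out.println(completes("", "a", "$5"));
        // With an actual match, the reference to the nonexistent group 5 is rejected.
        System.out.println(completes("a", "a", "$5"));
    }
}
```

Note that the second call fails with an `IndexOutOfBoundsException`, which the utility method above does not catch.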
Edit: Following shmosel's answer, I came up with this "solution":
    private static boolean isLower(char c) {
        return c >= 'a' && c <= 'z';
    }

    private static boolean isUpper(char c) {
        return c >= 'A' && c <= 'Z';
    }

    private static boolean isDigit(char c) {
        return isDigit(c - '0');
    }

    private static boolean isDigit(int c) {
        return c >= 0 && c <= 9;
    }

    @SuppressWarnings("unchecked")
    public static void checkRegexAndReplacement(String regex, String replacement) {
        Pattern parentPattern = Pattern.compile(regex);
        Map<String, Integer> namedGroups;
        int capturingGroupCount;
        try {
            Field namedGroupsField = Pattern.class.getDeclaredField("namedGroups");
            namedGroupsField.setAccessible(true);
            namedGroups = (Map<String, Integer>) namedGroupsField.get(parentPattern);
            Field capturingGroupCountField = Pattern.class.getDeclaredField("capturingGroupCount");
            capturingGroupCountField.setAccessible(true);
            capturingGroupCount = capturingGroupCountField.getInt(parentPattern);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            throw new RuntimeException("That's what you get for using reflection!", e);
        }
        int groupCount = capturingGroupCount - 1;

        // Process substitution string to replace group references with groups
        int cursor = 0;
        while (cursor < replacement.length()) {
            char nextChar = replacement.charAt(cursor);
            if (nextChar == '\\') {
                cursor++;
                if (cursor == replacement.length())
                    throw new IllegalArgumentException(
                        "character to be escaped is missing");
                nextChar = replacement.charAt(cursor);
                cursor++;
            } else if (nextChar == '$') {
                // Skip past $
                cursor++;
                // Throw IAE if this "$" is the last character in replacement
                if (cursor == replacement.length())
                    throw new IllegalArgumentException(
                        "Illegal group reference: group index is missing");
                nextChar = replacement.charAt(cursor);
                int refNum = -1;
                if (nextChar == '{') {
                    cursor++;
                    StringBuilder gsb = new StringBuilder();
                    while (cursor < replacement.length()) {
                        nextChar = replacement.charAt(cursor);
                        if (isLower(nextChar) ||
                            isUpper(nextChar) ||
                            isDigit(nextChar)) {
                            gsb.append(nextChar);
                            cursor++;
                        } else {
                            break;
                        }
                    }
                    if (gsb.length() == 0)
                        throw new IllegalArgumentException(
                            "named capturing group has 0 length name");
                    if (nextChar != '}')
                        throw new IllegalArgumentException(
                            "named capturing group is missing trailing '}'");
                    String gname = gsb.toString();
                    if (isDigit(gname.charAt(0)))
                        throw new IllegalArgumentException(
                            "capturing group name {" + gname +
                            "} starts with digit character");
                    if (namedGroups == null || !namedGroups.containsKey(gname))
                        throw new IllegalArgumentException(
                            "No group with name {" + gname + "}");
                    refNum = namedGroups.get(gname);
                    cursor++;
                } else {
                    // The first number is always a group
                    refNum = (int)nextChar - '0';
                    if (!isDigit(refNum))
                        throw new IllegalArgumentException(
                            "Illegal group reference");
                    cursor++;
                    // Capture the largest legal group string
                    boolean done = false;
                    while (!done) {
                        if (cursor >= replacement.length()) {
                            break;
                        }
                        int nextDigit = replacement.charAt(cursor) - '0';
                        if (!isDigit(nextDigit)) { // not a number
                            break;
                        }
                        int newRefNum = (refNum * 10) + nextDigit;
                        if (groupCount < newRefNum) {
                            done = true;
                        } else {
                            refNum = newRefNum;
                            cursor++;
                        }
                    }
                }
                if (refNum < 0 || refNum > groupCount) {
                    throw new IndexOutOfBoundsException("No group " + refNum);
                }
            } else {
                cursor++;
            }
        }
    }
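If this method throws, either the regex or the replacement string is invalid.
This is stricter than replaceAll or replaceFirst: when no match is found, those methods never call appendReplacement and therefore "miss" invalid group references.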
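As a side note on the reflection above: the number of capturing groups can also be read through the public API, since `Matcher.groupCount()` reports it without requiring a match. The sketch below assumes only that; `GroupCountDemo` and `capturingGroups` are hypothetical names. It does not cover named groups (I believe JDK 20 and later expose `Pattern.namedGroups()` for that, but treat this as an assumption to verify against your JDK).

```java
import java.util.regex.Pattern;

class GroupCountDemo {
    // Hypothetical helper: counts capturing groups using only public API.
    // Group zero (the whole match) is not included in the count.
    static int capturingGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(capturingGroups("(a)(b(c))")); // 3
        System.out.println(capturingGroups("(?:a)b"));    // 0
    }
}
```

The returned value corresponds to `groupCount` in the method above, that is, `capturingGroupCount - 1`.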
Answer 0 (score: 1)
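I'd say your best bet is to replicate the process implemented in Matcher.appendReplacement(), removing any logic related to the source string or the result string. This inevitably means you won't be able to perform some of the validation, such as verifying group names and indexes, but you should be able to apply most of it.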
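A minimal sketch of that idea, checking syntax only: it needs neither the source string nor the pattern, so it cannot tell whether a referenced group actually exists. The names `ReplacementSyntax` and `syntaxError` are hypothetical, and the checks are a simplified approximation of the ones in appendReplacement, not a copy of the JDK code.

```java
class ReplacementSyntax {
    // Returns null if the replacement is syntactically well formed,
    // otherwise a short description of the first problem found.
    static String syntaxError(String replacement) {
        int i = 0;
        int n = replacement.length();
        while (i < n) {
            char c = replacement.charAt(i++);
            if (c == '\\') {
                if (i == n) return "character to be escaped is missing";
                i++; // skip the escaped character
            } else if (c == '$') {
                if (i == n) return "group index is missing";
                c = replacement.charAt(i);
                if (c == '{') {
                    int close = replacement.indexOf('}', i);
                    if (close < 0) return "named group is missing trailing '}'";
                    String name = replacement.substring(i + 1, close);
                    // Assumed rule: ASCII letters and digits, starting with a letter.
                    if (!name.matches("[A-Za-z][A-Za-z0-9]*"))
                        return "illegal group name {" + name + "}";
                    i = close + 1;
                } else if (c >= '0' && c <= '9') {
                    i++; // later digits are part of the index or literal text
                } else {
                    return "illegal group reference";
                }
            }
        }
        return null;
    }
}
```

For example, `syntaxError("$1-${name}")` returns null, while `syntaxError("price: $")` reports a missing group index.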