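In both examples I want to replace the value of the "name" node. I use a regex group to match and replace. The group match works, but the replacement does not.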
input 1
<xml
<user:address>.../</user:address>
<user:name>foo</user:name>
</xml>
input 2
<xml
<user:address>.../</user:address>
<street:name>bar</street:name>
</xml>
private static final String NAME_GROUP = "name";
public static final Pattern pattern = Pattern.compile("<.*:name>" + "(?<" + NAME_GROUP + ">.*)</.*:name>");
final Matcher nameMatcher = pattern.matcher(str);
final String s = nameMatcher.find() ? nameMatcher.group(NAME_GROUP) : null;
System.out.println(s);
//foo
//bar
Now when I replace
String output = nameMatcher.replaceFirst("hello");
I get
hello</xml>
whereas I expected the following
<xml
<user:address>.../</user:address>
<user:name>hello</user:name>
</xml>
for both examples. Why does the group match work while the replacement does not?
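Answer 0 (score: 2)
Assuming this is just an example and you are not actually trying to parse XML with regex, you can use this approach. Here we match and capture the text before and after the value in separate capture groups. In the replacement, we use back-references to those groups to put that text back into the final output.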
final String str = "<xml\n" +
" <name>bar</name>\n" +
" <user:address>.../</user:address>\n" +
" <user:name>foo</user:name>\n" +
"</xml>";
final String NAME_GROUP = "name";
final Pattern pattern = Pattern.compile("(<(?:[^:]+:)?name>)(?<" + NAME_GROUP + ">.*?)(</(?:[^:]+:)?name>)");
final Matcher m = pattern.matcher(str);
StringBuilder sb = new StringBuilder();
while (m.find()) {
m.appendReplacement( sb, m.group(1) + "hello" + m.group(3) );
}
m.appendTail(sb);
System.out.println(sb);
Output:
<xml
 <name>hello</name>
 <user:address>.../</user:address>
 <user:name>hello</user:name>
</xml>
Note that for this specific case, the following shorter code can be used:
final Pattern pattern = Pattern.compile("(<(?:[^:]+:)?name>).*?(</(?:[^:]+:)?name>)");
final Matcher m = pattern.matcher(str);
String repl = m.replaceAll("$1hello$2");
System.out.println(repl);
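For anyone who wants to run the shorter variant as is, here is a minimal self-contained sketch. The class name ReplaceNameShort and the method replaceNames are made up for illustration, and the call to Matcher.quoteReplacement is an addition of mine so that a new value containing $ or \ is inserted literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceNameShort {
    // Group 1: opening name tag (prefix optional); group 2: closing name tag.
    private static final Pattern NAME_TAG =
        Pattern.compile("(<(?:[^:]+:)?name>).*?(</(?:[^:]+:)?name>)");

    // Hypothetical helper: replaces the content of every name element.
    static String replaceNames(String xml, String newValue) {
        // quoteReplacement escapes '$' and '\' so newValue is taken literally.
        String replacement = "$1" + Matcher.quoteReplacement(newValue) + "$2";
        return NAME_TAG.matcher(xml).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String in = "<xml\n"
            + " <user:address>.../</user:address>\n"
            + " <user:name>foo</user:name>\n"
            + "</xml>";
        System.out.println(replaceNames(in, "hello"));
    }
}
```

The two numbered groups put the tags back, so only the text between them changes.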
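Answer 1 (score: 1)
My guess is that here we wish to replace the content of the name element with some new name. One way is to create three capture groups: one for the opening tag as a left boundary, one for the content we want to replace, and a third for the closing tag: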
(<.+?:name>)(.+?)(<\/.+?:name>)
If this expression is not quite what you want, it can be modified or changed in regex101.com.
jex.im also helps to visualize the expressions.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "(<.+?:name>)(.+?)(<\\/.+?:name>)";
final String string = "<xml\n"
+ " <user:address>.../</user:address>\n"
+ " <user:name>foo</user:name>\n"
+ "</xml>\n"
+ "<xml\n"
+ " <user:address>.../</user:address>\n"
+ " <street:name>bar</street:name>\n"
+ "</xml>\n"
+ "<xml\n"
+ " <user:address>.../</user:address>\n"
+ " <user:name>hello</user:name>\n"
+ " </xml>";
// Java replacement strings reference groups with $n; "\\1" would insert a literal "1".
final String subst = "$1Any New Name You Wish Goes Here$3";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
// The substituted value will be contained in the result variable
final String result = matcher.replaceAll(subst);
System.out.println("Substitution result: " + result);
Edit:
If we also wish to match <name></name> tags, we can update the expression and make the first part of the tag optional:
(<(.+?:)?name>)(.+?)(<\/(.+?:)?name>)
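Note that the two extra optional groups shift the numbering: the closing tag is now group 4 rather than group 3, so the replacement has to change as well. A small sketch of how that might look (the class name OptionalPrefixDemo and the method rename are invented for this example):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalPrefixDemo {
    // Groups: 1 = opening tag, 2 = its optional prefix, 3 = content,
    //         4 = closing tag, 5 = its optional prefix.
    private static final Pattern NAME_TAG =
        Pattern.compile("(<(.+?:)?name>)(.+?)(</(.+?:)?name>)");

    // Hypothetical helper: swaps the content (group 3) for newValue.
    static String rename(String xml, String newValue) {
        // Keep the tags via $1 and $4; quote newValue so it is inserted literally.
        return NAME_TAG.matcher(xml)
            .replaceAll("$1" + Matcher.quoteReplacement(newValue) + "$4");
    }

    public static void main(String[] args) {
        String in = "<name>bar</name>\n<user:name>foo</user:name>";
        System.out.println(rename(in, "hello"));
    }
}
```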
Answer 2 (score: 1)
The replaceFirst and replaceAll operations on String/Matcher always replace the entire match. They boil down to something like
public static String replace(
CharSequence source, Pattern p, String replacement, boolean all) {
Matcher m = p.matcher(source);
if(!m.find()) return source.toString();
StringBuffer sb = new StringBuffer();
do m.appendReplacement(sb, replacement); while(all && m.find());
return m.appendTail(sb).toString();
}
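Note that before Java 9 we have to use StringBuffer instead of StringBuilder here, because appendReplacement and appendTail only accepted a StringBuffer until then.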
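To make the whole-match behaviour concrete, here is a small sketch that runs the question's pattern through both the built-in replaceFirst and a hand-rolled equivalent using the StringBuilder overloads available since Java 9. The class and method names are placeholders of mine, not part of any API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WholeMatchDemo {
    // The pattern from the question: the tags are part of the match.
    static final Pattern P = Pattern.compile("<.*:name>(?<name>.*)</.*:name>");

    // Built-in variant: replaces everything the pattern matched, tags included.
    static String viaReplaceFirst(String input, String replacement) {
        return P.matcher(input).replaceFirst(replacement);
    }

    // Hand-rolled variant (requires Java 9+ for the StringBuilder overloads).
    static String viaAppendReplacement(String input, String replacement) {
        Matcher m = P.matcher(input);
        StringBuilder sb = new StringBuilder();
        if (m.find()) {
            m.appendReplacement(sb, replacement);
        }
        return m.appendTail(sb).toString();
    }

    public static void main(String[] args) {
        String in = "<xml\n <user:name>foo</user:name>\n</xml>";
        System.out.println(viaReplaceFirst(in, "hello"));
        System.out.println(viaAppendReplacement(in, "hello"));
    }
}
```

Both calls drop the surrounding tags, which is exactly the symptom described in the question.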
If we give up the ability to include group references in the replacement string, we can go one level deeper in the logic and get
public static String replaceLiteral(
CharSequence source, Pattern p, String replacement, boolean all) {
Matcher m = p.matcher(source);
if(!m.find()) return source.toString();
StringBuilder sb = new StringBuilder();
int lastEnd = 0;
do {
sb.append(source, lastEnd, m.start()).append(replacement);
lastEnd = m.end();
} while(all && m.find());
return sb.append(source, lastEnd, source.length()).toString();
}
With this code, it is easy to change the logic so that it replaces a specific named group instead of the entire match:
public static String replaceGroupWithLiteral(
CharSequence source, Pattern p, String groupName, String replacement, boolean all) {
Matcher m = p.matcher(source);
if(!m.find()) return source.toString();
StringBuilder sb = new StringBuilder();
int lastEnd = 0;
do {
sb.append(source, lastEnd, m.start(groupName)).append(replacement);
lastEnd = m.end(groupName);
} while(all && m.find());
return sb.append(source, lastEnd, source.length()).toString();
}
This is already sufficient to implement your example:
private static final String NAME_GROUP = "name";
public static final Pattern pattern
= Pattern.compile("<.*:name>" + "(?<" + NAME_GROUP + ">.*)</.*:name>");
String input =
"<xml\n"
+ " <user:address>.../</user:address>\n"
+ " <user:name>foo</user:name>\n"
+ "</xml>\n";
String s = replaceGroupWithLiteral(input, pattern, NAME_GROUP, "hello", false);
System.out.println(s);
<xml
<user:address>.../</user:address>
<user:name>hello</user:name>
</xml>
Though I would probably use
public static final Pattern pattern
= Pattern.compile("<([^<>:]*?:name)>" + "(?<" + NAME_GROUP + ">.*)</\\1>");
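The difference between the two patterns shows up when the opening and closing tags disagree. The sketch below (class name BackrefPatternDemo and helper capture are assumptions for illustration) compares them: the \1 back-reference forces the closing tag to repeat the opening tag's name, while the original pattern accepts any closing tag ending in :name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackrefPatternDemo {
    // Original pattern: any opening and any closing tag ending in ":name".
    static final Pattern LOOSE =
        Pattern.compile("<.*:name>(?<name>.*)</.*:name>");

    // Stricter pattern: group 1 captures the tag name, \1 must repeat it.
    static final Pattern STRICT =
        Pattern.compile("<([^<>:]*?:name)>(?<name>.*)</\\1>");

    // Returns the content of the named group, or null if nothing matches.
    static String capture(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group("name") : null;
    }

    public static void main(String[] args) {
        String mismatched = "<user:name>foo</street:name>";
        System.out.println(capture(LOOSE, mismatched));
        System.out.println(capture(STRICT, mismatched));
        System.out.println(capture(STRICT, "<user:name>foo</user:name>"));
    }
}
```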
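As said (and made explicit by the method name), this differs from the ordinary regex replace operations in that it always inserts the replacement literally. Getting the same behavior as the original methods requires more complicated and less efficient code, so I would only use it when references to groups are really needed (or when that replacement syntax has to remain part of the method's contract).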
public static String replaceGroup(
CharSequence source, Pattern p, String groupName, String replacement, boolean all) {
Matcher m = p.matcher(source);
if(!m.find()) return source.toString();
StringBuffer sb = new StringBuffer();
do {
int s = m.start(), gs = m.start(groupName), e = m.end(), ge = m.end(groupName);
String prefix = s == gs? "":
Matcher.quoteReplacement(source.subSequence(s, gs).toString());
String suffix = e == ge? "":
Matcher.quoteReplacement(source.subSequence(ge, e).toString());
m.appendReplacement(sb, prefix+replacement+suffix);
} while(all && m.find());
return m.appendTail(sb).toString();
}
For example, if we use
String s = replaceGroup(input, pattern, NAME_GROUP, "[[${"+NAME_GROUP+"}]]", false);
we get
<xml
<user:address>.../</user:address>
<user:name>[[foo]]</user:name>
</xml>
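If changing the pattern is an option, a similar result can be had from the built-in replaceFirst by also capturing the tags in named groups and referencing them in the replacement. This is only a sketch under that assumption; the group names open and close, the class name, and both helper methods are invented here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupReplacementDemo {
    // Group 1 "open", group 2 the tag name, group 3 "name", group 4 "close".
    static final Pattern P = Pattern.compile(
        "(?<open><([^<>:]*?:name)>)(?<name>.*)(?<close></\\2>)");

    // replacement is a regex replacement string and may reference ${name}.
    static String replaceValue(String input, String replacement) {
        return P.matcher(input)
            .replaceFirst("${open}" + replacement + "${close}");
    }

    // Literal variant: '$' and '\' in the value are not interpreted.
    static String replaceValueLiteral(String input, String literal) {
        return replaceValue(input, Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        String input = "<xml\n"
            + " <user:address>.../</user:address>\n"
            + " <user:name>foo</user:name>\n"
            + "</xml>\n";
        System.out.println(replaceValue(input, "[[${name}]]"));
        System.out.println(replaceValueLiteral(input, "US$5"));
    }
}
```

The trade-off is that every pattern has to be written with the extra groups, whereas replaceGroup works with any pattern that merely contains the named group.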