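All the examples I have found cover cases where someone runs a regex search and needs to replace every group found with one specific value, or where the number of groups in the searched string is known in advance.
In my case, however, I need to change each group based on the value that was found. How can I replace each match with a value computed from that match?
This is what I have / have tried: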
Pattern pattern = Pattern.compile(DEFINITION_WITH_OR);
Matcher matcher = pattern.matcher(s);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
    String ss = matcher.group();
    /* Some string manipulation */
    // matcher.appendReplacement(sb, bestMatchedDefinition);
    // matcher.appendReplacement(sb, Matcher.quoteReplacement(ss));
    // s = s.replace(s.substring(matcher.start(), matcher.end()), ss);
}
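What I want to do is iterate over all the groups found, perform some manipulation on each one, and replace only that group, in place. The number of groups is not known before runtime.
So far, all my attempts have either changed everything or changed nothing at all. Any suggestions?
The manipulation I perform on the string is to split it on |, take the shortest part, and then remove the brackets.
Example input string:
Note: the following input string is simplified to show what my end result should be. The full string has many more annoying characters, which I clean up using the DEFINITION_WITH_OR pattern.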
a commissioned general officer in the United States Army,
[[United States Marine Corps|Marine Corps]],
or [[United States Air Force|Air Force]] superior to a lieutenant general.
A general is equal in rank or grade to a four star admiral. In the US Army,
a general is junior to a general of the army. In the US Marine Corps,
a general is the highest rank of commissioned officer. In the US Air Force,
a general is junior to a general of the air force.
The output should be:
a commissioned general officer in the United States Army,
Marine Corps,
or Air Force superior to a lieutenant general.
A general is equal in rank or grade to a four star admiral. In the US Army,
a general is junior to a general of the army. In the US Marine Corps,
a general is the highest rank of commissioned officer. In the US Air Force,
a general is junior to a general of the air force.
Note the Air Force and Marine Corps parts.
Answer 0 (score: 1)
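// Requires java.util.regex.Matcher and java.util.regex.Pattern.
String source = "a commissioned general officer in the United States Army, "
        + "[[United States Marine Corps|Marine Corps]], "
        + "or [[United States Air Force|Air Force]] superior to a lieutenant general.";
// Capture the text between [[ and ]]; ".*?" stops at the first closing "]]".
Pattern pattern = Pattern.compile("\\[\\[(.*?)\\]\\]");
Matcher m = pattern.matcher(source);
StringBuffer sb = new StringBuffer();
while (m.find()) {
    // Split the captured text on '|' and keep the shortest alternative.
    String[] terms = m.group(1).split("\\|");
    String shortestTerm = null;
    for (String term : terms) {
        if (shortestTerm == null || term.length() < shortestTerm.length()) {
            shortestTerm = term;
        }
    }
    // Quote the replacement so that any '$' or '\' in it is taken literally.
    m.appendReplacement(sb, Matcher.quoteReplacement(shortestTerm));
}
// Copy whatever follows the last match.
m.appendTail(sb);
String target = sb.toString();
System.out.println(target);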
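Mind the extra backslashes: the regex escapes \[, \] and \| have to be written with a doubled backslash inside a Java string literal.
".*?" takes the shortest matching sequence.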
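On Java 9 or later, the same per-match replacement can also be written without the explicit find/appendReplacement/appendTail loop, using the Matcher.replaceAll overload that takes a function. The following is only a sketch under that assumption; the class name WikiLinkShortener and the method name shortenLinks are made up for illustration and do not come from the question.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WikiLinkShortener {
    // Matches [[...]] and captures the text between the brackets (reluctantly).
    private static final Pattern LINK = Pattern.compile("\\[\\[(.*?)\\]\\]");

    // Hypothetical helper: replaces every [[a|b|...]] with its shortest alternative.
    static String shortenLinks(String source) {
        return LINK.matcher(source).replaceAll(match -> {
            String shortest = Arrays.stream(match.group(1).split("\\|"))
                    .min(Comparator.comparingInt(String::length))
                    .orElse("");
            // The returned string is interpreted like an appendReplacement
            // argument, so quote it to keep '$' and '\' literal.
            return Matcher.quoteReplacement(shortest);
        });
    }

    public static void main(String[] args) {
        String source = "[[United States Marine Corps|Marine Corps]], "
                + "or [[United States Air Force|Air Force]] superior to a lieutenant general.";
        // Prints: Marine Corps, or Air Force superior to a lieutenant general.
        System.out.println(shortenLinks(source));
    }
}
```

The function is called once per match, so the number of matches does not need to be known in advance. On Java 8 and earlier this overload is not available and the explicit loop is the way to go.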
Answer 1 (score: 0)
Well, thanks to Joop's answer, I realized that I had not added the following code:
matcher.appendTail(sb);
s = sb.toString();
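after the while loop. With that in place, the line matcher.appendReplacement(sb, Matcher.quoteReplacement(ss));
did the trick.
For some reason, matcher.appendReplacement(sb, ss);
also works, but it is much slower. If anyone knows why and can comment, that would be great.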
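Independently of speed, the two calls are not equivalent in behavior: appendReplacement treats '$' in the replacement as a group reference and '\' as an escape character, while Matcher.quoteReplacement makes both literal. The sketch below only illustrates that difference and does not attempt to explain the timing observation. The class QuoteReplacementDemo and the method replaceFirstWord are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementDemo {
    // Hypothetical helper: replaces the first word of the input, with or
    // without quoting the replacement string.
    static String replaceFirstWord(String input, String replacement, boolean quote) {
        Matcher m = Pattern.compile("(\\w+)").matcher(input);
        StringBuffer sb = new StringBuffer();
        if (m.find()) {
            String r = quote ? Matcher.quoteReplacement(replacement) : replacement;
            m.appendReplacement(sb, r);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unquoted: "$1" is expanded to the text of group 1 ("price").
        System.out.println(replaceFirstWord("price here", "$1!", false)); // price! here
        // Quoted: "$1" is inserted literally.
        System.out.println(replaceFirstWord("price here", "$1!", true));  // $1! here
    }
}
```

So the unquoted call gives the same result only as long as the computed replacement contains no '$' or '\'. With such characters it can produce different text or throw an exception (for example when '$' refers to a group that does not exist), which is why quoting is the safer choice when the replacement is derived from the input.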