My text looks like this:
| birth_date = {{birth date|1925|09|2|df=y}}
| birth_place = [[Bristol]], [[England]], UK
| death_date = {{death date and age|2000|11|16|1925|09|02|df=y}}
| death_place = [[Eastbourne]], [[Sussex]], England, UK
| origin =
| instrument = [[Piano]]
| genre =
| occupation = [[Musician]]
I want to get everything inside [[ ]]. I tried using replaceAll to remove everything that is not inside [[ ]], and then splitting on newlines to get the list of [[ ]] texts.
input = input.replaceAll("^[\\[\\[(.+)\\]\\]]", "");
Required output:
[[Bristol]]
[[England]]
[[Eastbourne]]
[[Sussex]]
[[Piano]]
[[Musician]]
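But this does not give the desired output. What am I missing here? There are thousands of documents; is this the fastest way to do it? If not, please tell me the best way to get the required output.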
Answer 0 (score: 6)
You need to match it instead of replacing:
Matcher m = Pattern.compile("\\[\\[\\w+\\]\\]").matcher(input);
while (m.find()) {
    // m.group() is the matched text, e.g. [[Bristol]]
    System.out.println(m.group());
}
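Note that `\w+` in the pattern above only matches letters, digits and underscores, so a link such as `[[New York City]]` would be skipped. A minimal sketch that collects the matches into a list and uses a negated character class instead is shown here; the class name `WikiLinkMatcher` and the method `findLinks` are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WikiLinkMatcher {
    // [^\[\]]+ accepts any run of characters other than square brackets,
    // so links containing spaces or punctuation are matched as well.
    private static final Pattern LINK = Pattern.compile("\\[\\[[^\\[\\]]+\\]\\]");

    // Hypothetical helper: returns every [[...]] occurrence, brackets included.
    static List<String> findLinks(String input) {
        List<String> links = new ArrayList<>();
        Matcher m = LINK.matcher(input);
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }

    public static void main(String[] args) {
        String input = "| birth_place = [[Bristol]], [[England]], UK\n"
                     + "| instrument = [[Piano]]\n";
        for (String link : findLinks(input)) {
            System.out.println(link);
        }
    }
}
```

The `{{...}}` templates are ignored because the pattern only starts a match at `[[`.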
Answer 1 (score: 2)
Use Matcher.find. For example:
import java.util.regex.*;
...
String text =
"| birth_date = {{birth date|1925|09|2|df=y}}\n" +
"| birth_place = [[Bristol]], [[England]], UK\n" +
"| death_date = {{death date and age|2000|11|16|1925|09|02|df=y}}\n" +
"| death_place = [[Eastbourne]], [[Sussex]], England, UK\n" +
"| origin = \n" +
"| instrument = [[Piano]]\n" +
"| genre = \n" +
"| occupation = [[Musician]]\n";
Pattern pattern = Pattern.compile("\\[\\[.+?\\]\\]");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println(matcher.group());
}
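Since the question mentions thousands of documents, one reasonable arrangement (not benchmarked here) is to compile the pattern once and reuse it for every document. The sketch below also adds a capturing group so that only the text between the brackets is returned. The names `LinkTargets` and `linkTargets` are assumptions made for this example, and the strings in `main` stand in for file contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinkTargets {
    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern LINK = Pattern.compile("\\[\\[(.+?)\\]\\]");

    // Hypothetical helper: returns the text between [[ and ]] for each link.
    static List<String> linkTargets(String text) {
        List<String> targets = new ArrayList<>();
        Matcher matcher = LINK.matcher(text); // a Matcher is created per document
        while (matcher.find()) {
            targets.add(matcher.group(1)); // group(1) excludes the brackets
        }
        return targets;
    }

    public static void main(String[] args) {
        // Stand-ins for file contents; real code would read each file into a String.
        List<String> documents = Arrays.asList(
            "| birth_place = [[Bristol]], [[England]], UK",
            "| occupation = [[Musician]]");
        for (String doc : documents) {
            System.out.println(linkTargets(doc));
        }
    }
}
```

Use `matcher.group()` instead of `matcher.group(1)` if the surrounding brackets should be kept, as in the required output of the question.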
Answer 2 (score: 0)
Just for fun, using replaceAll:
String output = input.replaceAll("(?s)(\\]\\]|^).*?(\\[\\[|$)", "$1\n$2");
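For readers wondering how this expression works: each match begins either at the start of the input or at a `]]`, lazily consumes everything up to the next `[[` or the end of the input, and is replaced by the two captured delimiters with a newline between them. The `(?s)` flag lets `.` cross line breaks. The result starts with an empty line, so a sketch that turns it into a list might trim before splitting. The wrapper names `ReplaceAllLinks` and `linksViaReplaceAll` are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ReplaceAllLinks {
    // Hypothetical helper wrapping the replaceAll expression from the answer.
    static List<String> linksViaReplaceAll(String input) {
        // Everything between one "]]" (or the start) and the next "[["
        // (or the end) collapses into a single newline.
        String output = input.replaceAll("(?s)(\\]\\]|^).*?(\\[\\[|$)", "$1\n$2");
        String trimmed = output.trim(); // drop the leading and trailing newlines
        if (trimmed.isEmpty()) {
            return Collections.emptyList(); // input contained no links
        }
        return Arrays.asList(trimmed.split("\n"));
    }

    public static void main(String[] args) {
        String input = "| birth_place = [[Bristol]], [[England]], UK\n"
                     + "| origin =\n"
                     + "| instrument = [[Piano]]";
        for (String link : linksViaReplaceAll(input)) {
            System.out.println(link);
        }
    }
}
```

This assumes every `[[` in the input has a matching `]]`; for unbalanced input the Matcher-based approaches are easier to reason about.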