I have the following string: "3/4Ton". I want to split it as ->
word[1] = 3/4, word[2] = Ton.
Right now my code looks like this:
    Pattern p = Pattern.compile("[A-Z]{1}[a-z]+");
    Matcher m = p.matcher(line);
    while (m.find()) {
        System.out.println("The word --> " + m.group());
    }
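It does the job of splitting a string on capital letters, for example:
String = MachineryInput
word[1] = Machinery, word[2] = Input
The only problem is that it does not preserve numbers or abbreviations, and it drops sequences of capital letters that are not separate words. Could someone help me out with this regex problem?
Thanks in advance...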
Answer 0 (score: 4)
Actually, you can do this with a regex alone, using lookahead and lookbehind (see "Special constructs" on this page: http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html)
    /**
     * We'll use this pattern as divider to split the string into an array.
     * Usage: myString.split(DIVIDER_PATTERN);
     */
    private static final String DIVIDER_PATTERN =
            "(?<=[^\\p{Lu}])(?=\\p{Lu})"
                    // either there is anything that is not an uppercase character
                    // followed by an uppercase character
            + "|(?<=[\\p{Ll}])(?=\\d)"
                    // or there is a lowercase character followed by a digit
            ;
    @Test
    public void testStringSplitting() {
        assertEquals(2, "3/4Word".split(DIVIDER_PATTERN).length);
        assertEquals(7, "ManyManyWordsInThisBigThing".split(DIVIDER_PATTERN).length);
        assertEquals(7, "This123/4Mixed567ThingIsDifficult"
                .split(DIVIDER_PATTERN).length);
    }
So what you can do is something like this:
    for (String word : myString.split(DIVIDER_PATTERN)) {
        System.out.println(word);
    }
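To see the divider at work on the strings from the question, here is a minimal, self-contained sketch. The class name `SplitDemo` and the helper `splitWords` are made up for illustration; the pattern itself is the one given above. Because both alternatives are zero-width lookarounds, `split` cuts at the boundary without consuming any characters.

```java
import java.util.Arrays;

final class SplitDemo {
    // Same divider as above: it matches a position, not characters.
    static final String DIVIDER_PATTERN =
            "(?<=[^\\p{Lu}])(?=\\p{Lu})"   // non-uppercase followed by uppercase
            + "|(?<=[\\p{Ll}])(?=\\d)";    // lowercase followed by a digit

    // Hypothetical helper, only here to keep the demo short.
    static String[] splitWords(String input) {
        return input.split(DIVIDER_PATTERN);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3/4Ton", "MachineryInput", "HTMLParser"}) {
            System.out.println(s + " -> " + Arrays.toString(splitWords(s)));
        }
        // 3/4Ton -> [3/4, Ton]
        // MachineryInput -> [Machinery, Input]
        // HTMLParser -> [HTMLParser]
    }
}
```

Note that a run of capitals such as "HTMLParser" stays in one piece: the first alternative only matches after a character that is not uppercase, so no cut is made inside or right after the run.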
Sean
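Answer 1 (score: 2)
It would be nice to use a regex here. I bet there is a way to do it, although I'm not a swing-in-on-a-vine regex guy, so I can't help you there. However, there is one thing you can't avoid: something, eventually, has to loop over your String. You can do that "yourself" like so: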
    // requires: import java.util.ArrayList;
    String[] splitOnCapitals(String str) {
        ArrayList<String> array = new ArrayList<String>();
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            // An uppercase letter closes the current word and starts a new one.
            if (Character.isUpperCase(str.charAt(i))) {
                String line = builder.toString().trim();
                if (line.length() > 0) array.add(line);
                builder = new StringBuilder();
            }
            builder.append(str.charAt(i));
        }
        // get the last little bit too (skipped if it is empty)
        String last = builder.toString().trim();
        if (last.length() > 0) array.add(last);
        return array.toArray(new String[0]);
    }
I tested it with the following test driver:
    public static void main(String[] args) {
        String test = "3/4 Ton truCk";
        String[] arr = splitOnCapitals(test);
        for (String s : arr) System.out.println(s);
        test = "Start with Capital";
        arr = splitOnCapitals(test);
        for (String s : arr) System.out.println(s);
    }
and got the following output:
3/4
Ton tru
Ck
Start with
Capital
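The loop above starts a new word at every uppercase letter, so an abbreviation like "USA" would come out as three one-letter words. If runs of capitals should stay together, as the question seems to ask, one possible variation is to break only where an uppercase letter follows a character that is not uppercase. This is just a sketch under that assumption; `CapitalRunSplitter`, `splitOnCapitalRuns` and `addIfNotBlank` are names invented here, not part of the code above.

```java
import java.util.ArrayList;
import java.util.List;

final class CapitalRunSplitter {
    // Hypothetical variant of splitOnCapitals: break only where an uppercase
    // letter follows a character that is not uppercase.
    static String[] splitOnCapitalRuns(String str) {
        List<String> words = new ArrayList<>();
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            boolean startsWord = Character.isUpperCase(c)
                    && i > 0
                    && !Character.isUpperCase(str.charAt(i - 1));
            if (startsWord) {
                addIfNotBlank(words, builder);
                builder = new StringBuilder();
            }
            builder.append(c);
        }
        addIfNotBlank(words, builder); // the last word
        return words.toArray(new String[0]);
    }

    // Trims the buffered text and stores it unless nothing is left.
    private static void addIfNotBlank(List<String> words, StringBuilder builder) {
        String word = builder.toString().trim();
        if (!word.isEmpty()) words.add(word);
    }

    public static void main(String[] args) {
        for (String s : splitOnCapitalRuns("USA Today")) System.out.println(s);
        for (String s : splitOnCapitalRuns("3/4 Ton truCk")) System.out.println(s);
    }
}
```

With this change "USA Today" yields "USA" and "Today", while "3/4 Ton truCk" still yields "3/4", "Ton tru" and "Ck", the same as before.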