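I have the string There is a boy's puppy. Really? and I need to find the external punctuation, separate it from the word it is attached to, and treat it as a word of its own. The expected result is:
boy's is one word (the punctuation is internal).
puppy. is two words, puppy and .
Really? is two words, Really and ?
My code already splits the words on external punctuation, but I want the punctuation marks to be kept as separate words.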
String[] Res = word.split("[\\p{Punct}\\s]+");
How can I do this?
Answer 0 (score: 1)
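What you want in your regex is a lookahead, so that the punctuation becomes part of the output instead of being thrown away. The regex I use has two alternatives separated by an OR (|). The first, (\s+), is an ordinary group that consumes whitespace, so the whitespace is dropped from the result. The second, (?=[\.\?]), is a zero-width positive lookahead of the form (?=X): it matches the position just before a . or ? without consuming the character, so the split happens there and the punctuation mark survives as its own element. I am not sure I've included all the external punctuation you wanted in the lookahead's character class.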
String word = "There is a boy's puppy. Really?";
String[] res = word.split("(\\s+)|(?=[\\.\\?])");
for (String s : res) {
    System.out.print("[" + s + "]");
}
Output is
[There][is][a][boy's][puppy][.][Really][?]
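If you would rather not list every external punctuation mark by hand, one alternative is to match the tokens instead of splitting on the separators. The following is only a sketch under a few assumptions: the class name PunctuationTokenizer and the method tokenize are made up for illustration, "internal" punctuation is taken to mean punctuation with letters or digits on both sides, and \p{Alnum} and \p{Punct} only cover ASCII unless you compile the pattern with Pattern.UNICODE_CHARACTER_CLASS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
class PunctuationTokenizer {
    // A token is either
    //   1) a run of letters/digits, optionally continued by a punctuation
    //      character that is immediately followed by more letters/digits
    //      (so boy's stays in one piece), or
    //   2) a single punctuation character on its own.
    private static final Pattern TOKEN =
        Pattern.compile("\\p{Alnum}+(?:\\p{Punct}\\p{Alnum}+)*|\\p{Punct}");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group()); // whitespace never matches, so it is skipped
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String s : tokenize("There is a boy's puppy. Really?")) {
            System.out.print("[" + s + "]");
        }
        System.out.println();
    }
}
```

For the string in the question this prints [There][is][a][boy's][puppy][.][Really][?], the same as the split above. Be aware that under the assumption stated here, input such as a,b (no space after the comma) is kept as a single token, so the split-based version may suit you better if your text contains that.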