I want to split some long sentences into fixed-length chunks. So far I have been using Guava:
Splitter.fixedLength(20).split(string);
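This works well, but how can I prevent splits in the middle of a word? My goal is chunks of at most 20 characters, with a chunk being shorter whenever the 20-character split point does not fall on a space.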
Answer 0 (score: 3)
I found that org.apache.commons.lang3.text.WordUtils.wrap() does exactly what I asked for.
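Answer 1 (score: 2)
I would split on whitespace, then join consecutive words back together for as long as the result still fits within the limit.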
import java.util.ArrayList;
import java.util.List;

String[] arr = str.trim().split("\\s+"); // split the input into words on whitespace
List<String> split = new ArrayList<>(); // final list of chunks
StringBuilder chunk = new StringBuilder(); // chunk currently being built
for (String word : arr) {
    // flush the current chunk if adding a space plus this word would exceed 20 chars
    if (chunk.length() > 0 && chunk.length() + 1 + word.length() > 20) {
        split.add(chunk.toString());
        chunk.setLength(0);
    }
    if (chunk.length() > 0) {
        chunk.append(' '); // separate words within a chunk by a single space
    }
    chunk.append(word); // note: a single word longer than 20 chars is kept whole
}
if (chunk.length() > 0) {
    split.add(chunk.toString()); // add the last chunk
}
return split;
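For anyone who wants to try this approach directly, here is a minimal self-contained sketch of it. The class name `WordChunker`, the method `chunk`, and the `maxLen` parameter are names I made up for illustration. One assumption that goes beyond the answer: a single word longer than `maxLen` is cut into fixed-length pieces so that no chunk ever exceeds the limit.

```java
import java.util.ArrayList;
import java.util.List;

public class WordChunker {
    // Hypothetical helper: packs whitespace-separated words into chunks
    // of at most maxLen characters.
    static List<String> chunk(String text, int maxLen) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Assumption: a word longer than maxLen is cut into fixed-length pieces.
            while (word.length() > maxLen) {
                if (current.length() > 0) {
                    chunks.add(current.toString());
                    current.setLength(0);
                }
                chunks.add(word.substring(0, maxLen));
                word = word.substring(maxLen);
            }
            if (word.isEmpty()) {
                continue; // empty input, or a long word consumed exactly
            }
            // Start a new chunk if a space plus this word would not fit.
            if (current.length() > 0 && current.length() + 1 + word.length() > maxLen) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            chunks.add(current.toString()); // flush the last chunk
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (String c : chunk("The quick brown fox jumps over the lazy dog", 20)) {
            System.out.println("[" + c + "]");
        }
    }
}
```

Runs of whitespace in the input collapse to a single space in the output, which may or may not be what you want.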
Answer 2 (score: 2)
Perhaps:
Matcher m = Pattern.compile("(?s)(.{1,19}(\\s|$)|\\S{20}|\\S+$)").matcher(s);
while (m.find()) {
String part = m.group(1);
...
}
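The regex, piece by piece:
(?s)             DOTALL flag: . also matches line terminators
(
.{1,19}(\\s|$)   up to 19 chars followed by whitespace or end-of-string
                 (a word boundary \\b could be used instead)
|
\\S{20}          20 non-whitespace chars (a word too long to split at a space)
|
\\S+$            remaining non-whitespace chars at the end of the string
)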
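To see what the pattern actually produces, here is a small runnable sketch around it. The class name `RegexChunker` and the method `chunk` are hypothetical. Trimming each match is my own addition, since a match may include the whitespace character at the split point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexChunker {
    // Same pattern as in the answer, compiled once.
    private static final Pattern CHUNK =
            Pattern.compile("(?s)(.{1,19}(\\s|$)|\\S{20}|\\S+$)");

    // Hypothetical helper: collects the successive matches as chunks.
    static List<String> chunk(String s) {
        List<String> parts = new ArrayList<>();
        Matcher m = CHUNK.matcher(s);
        while (m.find()) {
            // Assumption: the whitespace at the split point is not wanted.
            String part = m.group(1).trim();
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        for (String part : chunk("The quick brown fox jumps over the lazy dog")) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Unlike the split-and-join approach, this keeps whitespace inside a chunk exactly as it appears in the input.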