String original = "This is a sentence.Rajesh want to test the application for the word split.";
List<String> matchList = new ArrayList<>();
Pattern regex = Pattern.compile(".{1,10}(?:\\s|$)", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(original);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}
System.out.println("Match List " + matchList);
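I need to parse text into an array of lines that are each at most 10 characters long, and a word should not be cut off at the end of a line.
I used the logic above in my scenario, but the problem is that when the break falls inside a word at the end of a line, the match jumps ahead to the nearest whitespace after the 10 characters, so the start of that word is lost.
For example, the actual sentence is "This is a sentence.Rajesh want to test the application for the word split." but after the logic runs the output is as follows.
Match List [This is a , nce.Rajesh , want to , test the , pplication , for the , word , split.]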
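Answer 0 (score: 4)
OK, so I managed to get the following working, with a maximum line length of 10, and it also splits words that are longer than 10 characters correctly!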
String original = "This is a sentence. Rajesh want to test the applications for the word split handling.";
List<String> matchList = new ArrayList<>();
Pattern regex = Pattern.compile("(.{1,10}(?:\\s|$))|(.{0,10})", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(original);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}
System.out.println("Match List " + matchList);
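The result, one match per line (each match that ends at a space keeps that trailing space, and the last match is an empty string):
This is a
sentence.
Rajesh
want to
test the
applicatio
ns for the
word split
handling.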
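A match from this pattern can include the whitespace character that ended it, and `.{0,10}` can also match the empty string at the end of the input. Below is a minimal sketch of one way to tidy that up, assuming trimmed, non-empty lines are what is wanted. The class name `RegexWrap`, the method `wrap`, and the `max` parameter are hypothetical names introduced for illustration; the pattern itself is the one from this answer with the limit made configurable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexWrap {
    // Hypothetical helper: same two alternatives as the pattern above,
    // with the limit passed in and each match cleaned up afterwards.
    static List<String> wrap(String text, int max) {
        Pattern regex = Pattern.compile(
            "(.{1," + max + "}(?:\\s|$))|(.{0," + max + "})", Pattern.DOTALL);
        Matcher m = regex.matcher(text);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            // Remove the whitespace that terminated the match (if any).
            String line = m.group().trim();
            // Skip the empty match that .{0,max} produces at end of input.
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String original =
            "This is a sentence. Rajesh want to test the applications for the word split handling.";
        System.out.println("Match List " + wrap(original, 10));
    }
}
```

With the sentence above and a limit of 10 this should print `Match List [This is a, sentence., Rajesh, want to, test the, applicatio, ns for the, word split, handling.]`. Trimming after matching keeps the regex unchanged; the first alternative matches at most `max` characters plus one whitespace character, so every trimmed line stays within the limit.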
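Answer 1 (score: 2)
This question was tagged as Groovy at some point. Assuming a Groovy answer is still valid and you are not worried about preserving runs of multiple spaces (e.g. two consecutive spaces):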
def splitIntoLines(text, maxLineSize) {
    def words = text.split(/\s+/)
    def lines = ['']
    words.each { word ->
        def lastLine = (lines[-1] + ' ' + word).trim()
        if (lastLine.size() <= maxLineSize)
            // Change last line.
            lines[-1] = lastLine
        else
            // Add word as new line.
            lines << word
    }
    lines
}
// Tests...
def original = "This is a sentence. Rajesh want to test the application for the word split."
assert splitIntoLines(original, 10) == [
"This is a",
"sentence.",
"Rajesh",
"want to",
"test the",
"application",
"for the",
"word",
"split."
]
assert splitIntoLines(original, 20) == [
"This is a sentence.",
"Rajesh want to test",
"the application for",
"the word split."
]
assert splitIntoLines(original, original.size()) == [original]
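For readers who want the same word-by-word packing in Java, here is a minimal sketch under stated assumptions. Unlike the Groovy version above, which leaves an overlong word such as "application" on a line of its own, this variant also cuts any word longer than the limit into limit-sized pieces. The class name `GreedyWrap` and the method `wrap` are hypothetical names chosen for illustration, and runs of whitespace are collapsed just as in the Groovy code.

```java
import java.util.ArrayList;
import java.util.List;

class GreedyWrap {
    // Hypothetical helper: packs words greedily into lines of at most max chars.
    static List<String> wrap(String text, int max) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) continue; // only occurs for blank input
            // Cut an overlong word into max-sized chunks; the tail stays in 'word'.
            while (word.length() > max) {
                if (line.length() > 0) {
                    lines.add(line.toString());
                    line.setLength(0);
                }
                lines.add(word.substring(0, max));
                word = word.substring(max);
            }
            // Start a new line when the word plus a separating space does not fit.
            if (line.length() > 0 && line.length() + 1 + word.length() > max) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) line.append(' ');
            line.append(word);
        }
        if (line.length() > 0) lines.add(line.toString());
        return lines;
    }

    public static void main(String[] args) {
        String original =
            "This is a sentence.Rajesh want to test the application for the word split.";
        System.out.println(wrap(original, 10));
    }
}
```

For the question's original sentence (no space after "sentence.") and a limit of 10, this should print `[This is a, sentence.R, ajesh want, to test, the, applicatio, n for the, word, split.]`. Whether the tail of a cut word should share a line with the words that follow, as it does here, is a design choice; flushing the line after the last chunk would keep the pieces of a cut word separate from the following words.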
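Answer 2 (score: 1)
I avoided regex because it doesn't pull its weight here. This code word-wraps, and if a single word is longer than 10 characters, it breaks the word. It also takes care of excess whitespace.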
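import static java.lang.Character.isWhitespace;

import java.util.ArrayList;
import java.util.List;

// Wrapper class (name chosen here) so the snippet compiles as a single file.
public class WordWrap {
    public static void main(String[] args) {
        final String original =
            "This is a sentence.Rajesh want to test the application for the word split.";
        final StringBuilder b = new StringBuilder(original.trim());
        final List<String> matchList = new ArrayList<String>();
        while (true) {
            // Drop leading whitespace left over from the previous line.
            b.delete(0, indexOfFirstNonWsChar(b));
            if (b.length() == 0) break;
            // Cut just past the last whitespace in the first 10 chars, or at 10 if there is none.
            final int splitAt = lastIndexOfWsBeforeIndex(b, 10);
            matchList.add(b.substring(0, splitAt).trim());
            b.delete(0, splitAt);
        }
        System.out.println("Match List " + matchList);
    }

    // Returns the index just past the last whitespace among the first i chars of s;
    // returns s.length() if s has at most i chars, or i if no whitespace is found.
    static int lastIndexOfWsBeforeIndex(CharSequence s, int i) {
        if (s.length() <= i) return s.length();
        for (int j = i; j > 0; j--) if (isWhitespace(s.charAt(j - 1))) return j;
        return i;
    }

    // Returns the index of the first non-whitespace char, or s.length() if there is none.
    static int indexOfFirstNonWsChar(CharSequence s) {
        for (int i = 0; i < s.length(); i++) if (!isWhitespace(s.charAt(i))) return i;
        return s.length();
    }
}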
Prints:
Match List [This is a, sentence.R, ajesh, want to, test the, applicatio, n for the, word, split.]