Split text after a specified length without breaking words, using Grails

Date: 2012-03-12 04:39:23

Tags: java regex groovy

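I have a long string that I need to split into an array of strings, each no longer than 50 characters. The tricky part for me is making sure the regex finds the last space before the 50-character mark, so that the breaks between strings are clean: I don't want to cut words in half.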

public List<String> splitInfoText(String msg) {
    int MAX_WIDTH = 50;
    def line = [];
    String[] words;
    msg = msg.trim();
    words = msg.split(" ");
    StringBuffer s = new StringBuffer();
    words.each { word ->
        // Append the next word, then check whether the line has grown too long.
        s.append(word + " ");
        if (s.length() > MAX_WIDTH) {
            // Remove the word that overflowed, emit the line, and start a new one with that word.
            s.replace(s.length() - word.length() - 1, s.length(), " ");
            line << s.toString().trim();
            s = new StringBuffer(word + " ");
        }
    }
    // Emit whatever is left in the buffer.
    if (s.length() > 0)
        line << s.toString().trim();
    return line;
}

2 answers:

Answer 0: (score: 6)

Try this:

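import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Each match: 1 to 50 characters, then one whitespace character (kept in the match) or end of input.
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile(".{1,50}(?:\\s|$)", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}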
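
For anyone who wants to try the pattern in isolation, here is a minimal self-contained sketch. The class name `RegexWrap`, the method `wrap`, and the `maxWidth` parameter are illustrative assumptions, not part of the original answer. It also trims each match, since the pattern keeps the trailing whitespace character in the matched text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWrap {
    // Hypothetical helper: splits text into chunks of at most maxWidth characters,
    // breaking only at whitespace, using the same pattern shape as the answer above.
    static List<String> wrap(String text, int maxWidth) {
        Pattern regex = Pattern.compile(".{1," + maxWidth + "}(?:\\s|$)", Pattern.DOTALL);
        Matcher m = regex.matcher(text.trim());
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            // The match may end with the whitespace character it broke on, so trim it.
            lines.add(m.group().trim());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : wrap("The quick brown fox jumps over the lazy dog", 15)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

With a width of 15, the sample sentence comes out as "The quick brown", "fox jumps over" and "the lazy dog".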

Answer 1: (score: 4)

I believe a Groovier version of Tim's answer is:

List matchList = ( subjectString =~ /(?s)(.{1,50})(?:\s|$)/ ).collect { it[ 1 ] }
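
One caveat with the regex approach: a single word longer than 50 characters cannot satisfy `.{1,50}` followed by whitespace, so the matcher skips ahead and the start of that word is silently dropped. If that matters for your input, a loop-based variant along the lines of the question's code can hard-split such words instead. The sketch below is one possible way to do it; `LoopWrap`, `wrap` and `maxWidth` are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopWrap {
    // Hypothetical helper: greedy word wrap that never loses text.
    // Words longer than maxWidth are cut into maxWidth-sized pieces.
    static List<String> wrap(String text, int maxWidth) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Hard-split a word that cannot fit on any line by itself.
            while (word.length() > maxWidth) {
                if (current.length() > 0) {
                    lines.add(current.toString());
                    current.setLength(0);
                }
                lines.add(word.substring(0, maxWidth));
                word = word.substring(maxWidth);
            }
            if (word.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            // Start a new line if adding " " + word would exceed the limit.
            if (current.length() > 0 && current.length() + 1 + word.length() > maxWidth) {
                lines.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(wrap("aaaaaaaaaaaa bb", 5));
    }
}
```

The trade-off is more code than the one-line pattern, in exchange for keeping every character of the input apart from the whitespace it breaks on.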