Java: split a string at an index without cutting words

Time: 2018-09-18 19:18:34

Tags: java arrays string

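I'm just wondering whether there is an API, or some quick and easy way, to split a String at a given index into a String[] array, such that if the index falls inside a word, the whole word is moved into the next string.

So let's say I have the string: "I often used to look out of the window, but I rarely do that anymore"

The string is 68 characters long and I have to cut it at 36, which falls on the letter n in this sentence. Instead, it should split after the word "the", so that the array would be ["I often used to look out of the", "window, but I rarely do that anymore"]

If a resulting piece is itself longer than 36, it should be split as well. So for a longer sentence: "I often used to look out of the window, but I rarely do that anymore, even though I liked it"
the result would be ["I often used to look out of the", "window, but I rarely do that anymore", ",even though I liked it"]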

2 answers:

Answer 0: (score: 1)

This repeatedly (and greedily) matches between 1 and size characters, and requires each match to be followed by whitespace or by the end of the string.

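import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static List<String> chunk(String s, int size) {
    List<String> chunks = new ArrayList<>(s.length()/size+1);
    // Up to `size` characters (greedy), then one whitespace character or the end of the input.
    // The terminating whitespace is part of the match, so a chunk may end with it.
    Pattern pattern = Pattern.compile(".{1," + size + "}(?:\\s|$)");
    Matcher matcher = pattern.matcher(s);
    while (matcher.find()) {
        chunks.add(matcher.group());
    }
    return chunks;
}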

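Note that this doesn't work if the input contains a run of more than size characters without whitespace: the matcher skips ahead until a match fits, so the beginning of that run is silently dropped.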
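One possible way around that limitation is to give the regex a fallback alternative that hard-breaks an oversized word. The following is only a sketch: the class name RegexChunkDemo and the method name chunkWithFallback are made up for illustration, and the choice to trim each chunk and to break long words at exactly size characters is an assumption, not something the question specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative and not from the answer above.
final class RegexChunkDemo {

    static List<String> chunkWithFallback(String s, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be at least 1");
        }
        // First alternative: a chunk that starts on a non-whitespace character,
        // is at most `size` characters long, and ends right before whitespace
        // or the end of the input.
        // Second alternative (fallback): exactly `size` non-whitespace characters,
        // which only matches when a word is too long for the first alternative.
        Pattern pattern = Pattern.compile(
                "\\S.{0," + (size - 1) + "}(?=\\s|$)|\\S{" + size + "}");
        Matcher matcher = pattern.matcher(s);
        List<String> chunks = new ArrayList<>();
        while (matcher.find()) {
            // trim() removes trailing whitespace left over from runs of spaces
            chunks.add(matcher.group().trim());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String s = "I often used to look out of the window, but I rarely do that anymore";
        for (String chunk : chunkWithFallback(s, 36)) {
            System.out.println("|" + chunk + "|");
        }
    }
}
```

Because every chunk has to start on a non-whitespace character, the whitespace between chunks is skipped rather than counted against the limit, so a chunk of exactly size characters (such as "window, but I rarely do that anymore" with size 36) is kept whole.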

Answer 1: (score: 1)

Here's an old-fashioned, non-stream, non-regex solution:

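import java.util.ArrayList;
import java.util.List;

public static List<String> chunk(String s, int limit)
{
    List<String> parts = new ArrayList<>();
    while(s.length() > limit)
    {
        // Scan backwards from index limit-1 for the nearest whitespace character.
        int splitAt = limit-1;
        while(splitAt > 0 && !Character.isWhitespace(s.charAt(splitAt)))
            splitAt--;
        if(splitAt == 0)
            return parts; // can't be split: the rest of s is not returned
        parts.add(s.substring(0, splitAt));
        s = s.substring(splitAt+1); // skip the whitespace character itself
    }
    parts.add(s); // the remainder fits within the limit
    return parts;
}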

This doesn't trim additional spaces on either side of the split point. Also, if a string cannot be split because it doesn't contain any whitespace in the first limit characters, it gives up and returns only the parts collected so far; the rest of the string is discarded.

Test:

public static void main(String[] args)
{
    String[] tests = {
            "This is a short string",
            "This sentence has a space at chr 36 so is a good test",
            "I often used to look out of the window, but I rarely do that anymore, even though I liked it",
            "I live in Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch",
    };

    int limit = 36;
    for(String s : tests)
    {
        List<String> chunks = chunk(s, limit);
        for(String st : chunks)
            System.out.println("|" + st + "|");
        System.out.println();
    }
}

Output:

|This is a short string|

|This sentence has a space at chr 36|
|so is a good test|

|I often used to look out of the|
|window, but I rarely do that|
|anymore, even though I liked it|

|I live in|
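
The last test case shows the loop-based version giving up on a word that is longer than the limit. A variant that trims whitespace around each split point and hard-breaks such a word could look roughly like the sketch below. The names WrapDemo and chunkAll are invented for illustration, and both the hard break at exactly limit characters and the scan starting at index limit (so that a chunk may be exactly limit characters long) are assumptions about the desired behaviour rather than anything stated in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative and not from the answers above.
final class WrapDemo {

    static List<String> chunkAll(String s, int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        List<String> parts = new ArrayList<>();
        String rest = s.trim();
        while (rest.length() > limit) {
            // Scan backwards from index `limit` for whitespace, so that a chunk
            // of exactly `limit` characters followed by whitespace is accepted.
            int splitAt = limit;
            while (splitAt > 0 && !Character.isWhitespace(rest.charAt(splitAt))) {
                splitAt--;
            }
            if (splitAt == 0) {
                // No whitespace found: assumed fallback is a hard break at `limit`.
                splitAt = limit;
            }
            parts.add(rest.substring(0, splitAt).trim());
            // trim() drops the whitespace around the split point
            rest = rest.substring(splitAt).trim();
        }
        if (!rest.isEmpty()) {
            parts.add(rest);
        }
        return parts;
    }

    public static void main(String[] args) {
        String s = "I live in Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch";
        for (String part : chunkAll(s, 36)) {
            System.out.println("|" + part + "|");
        }
    }
}
```

Since rest is trimmed before every iteration, its first character is never whitespace, so splitAt is at least 1 whenever a split happens and the loop always makes progress.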