Java: splitting a string into segments at every nth position

Time: 2018-07-15 04:11:59

Tags: java

I have some Java code and a long text (up to 500 characters). I want to split this text into segments of exactly 6 characters each. For example, here is a sample text:

String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";

I want this result:

  

segment1: Syria

segment2: offici

segment3: ally k

segment n: ...

I tried a for loop, but it did not produce the result I wanted, and I also got this error:

java.lang.StringIndexOutOfBoundsException: length=67; regionStart=65; regionLength=5

Here is my code:

    String msg = fullText;

for(int i=-1 ; i <= fullText.length()+1; i++){
            
     int len = msg.length();
     text = new StringBuilder().append(msgInfo).append(msg.substring(i, i + 6)).toString();
     
     msg = new StringBuilder().append(msg.substring(i +5, len)).toString();

     LogHelper.d(TAG, "teeeeeeeeeeeeext:"+i +" .."+ text);

        }

How can I do this segmentation correctly? Thanks!

4 answers:

Answer 0: (score: 2)

You're on the right track, but you've made the process more complicated than it needs to be.

Try something like this:

int segmentSize = 6;
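// Round up so a trailing partial segment gets a slot, without leaving a null
// entry when msg.length() is an exact multiple of segmentSize
String[] segments = new String[(msg.length() + segmentSize - 1) / segmentSize];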

for (int i = 0; i < msg.length(); i += segmentSize) {
    // ensure we don't try to access out of bounds indexes
    int lastIndex = Math.min(msg.length(), i+segmentSize);
    int segmentNumber = i/segmentSize;
    segments[segmentNumber] = msg.substring(i, lastIndex);
}

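This puts the segments into the array named segments. Math.min(msg.length(), i+segmentSize) ensures you never try to read characters past the end of the string, which is what caused the StringIndexOutOfBoundsException you mentioned.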
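To make the clamping concrete, here is a minimal sketch that contrasts an unclamped substring call with a clamped one. The class name ClampDemo and the helper segmentAt are hypothetical names chosen for illustration; they are not part of the question or the answer.

```java
final class ClampDemo {

    // Hypothetical helper: returns the segment starting at 'start',
    // at most 'size' characters long, never reading past the end of 'text'.
    static String segmentAt(String text, int start, int size) {
        int end = Math.min(text.length(), start + size);
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "Syria officially"; // 16 characters

        try {
            // End index 18 is larger than the length 16, so this throws.
            text.substring(12, 18);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unclamped call failed: " + e.getClass().getSimpleName());
        }

        // The clamped call returns the 4 characters that remain.
        System.out.println("clamped: '" + segmentAt(text, 12, 6) + "'");
    }
}
```

The last segment is simply shorter than the others when the text length is not a multiple of the segment size.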

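You can do something else with the segments instead of putting them in an array. If your end goal is to combine them into a longer string, you can create a StringBuilder outside the for loop (for example, where the segments array is declared), append to it inside the loop as needed, and read the result after the loop (i.e. sb.toString()), rather than creating a new StringBuilder instance on every iteration.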
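The single-StringBuilder idea could look roughly like the sketch below. The class name SegmentJoiner, the method labelSegments, and the "segmentN: " label format are assumptions made for illustration, loosely modeled on the output the question asks for.

```java
final class SegmentJoiner {

    // Hypothetical helper: builds one string with a labeled line per segment.
    static String labelSegments(String msg, int segmentSize) {
        StringBuilder sb = new StringBuilder(); // created once, outside the loop
        for (int i = 0; i < msg.length(); i += segmentSize) {
            // Clamp the end index so the last segment may be shorter.
            int end = Math.min(msg.length(), i + segmentSize);
            sb.append("segment").append(i / segmentSize + 1)
              .append(": ").append(msg, i, end) // appends msg[i, end) without a temporary String
              .append('\n');
        }
        return sb.toString(); // read the result once, after the loop
    }

    public static void main(String[] args) {
        System.out.print(labelSegments("Syria officially", 6));
    }
}
```

Using append(CharSequence, int, int) avoids creating an intermediate substring for each segment, although calling msg.substring(i, end) would work just as well.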

Answer 1: (score: 2)

Here is a concise implementation using Java 8 streams:

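// Needs java.util.Collection, java.util.TreeMap,
// java.util.concurrent.atomic.AtomicInteger and java.util.stream.Collectors
String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";
final AtomicInteger counter = new AtomicInteger(0);
Collection<String> strings = fullText.chars()
                                    .mapToObj(i -> String.valueOf((char)i) )
                                    // TreeMap keeps the groups in segment order; the default HashMap does not guarantee it
                                    .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / 6
                                                                ,TreeMap::new
                                                                ,Collectors.joining()))
                                    .values();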

Output:

[Syria , offici, ally k, nown a, s the , Syrian,  Arab , Republ, ic, is,  a cou, ntry i, n West, ern As, ia...]
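As a possible variation on the stream approach, the sketch below iterates over segment indices with IntStream.range instead of grouping individual characters, so it needs no shared counter. This is a swapped-in technique rather than the answer's own method, and the names StreamSegments and split are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class StreamSegments {

    // Hypothetical helper: splits 'text' into chunks of at most 'size' characters.
    static List<String> split(String text, int size) {
        int count = (text.length() + size - 1) / size; // number of segments, rounded up
        return IntStream.range(0, count)
                .mapToObj(k -> text.substring(k * size, Math.min(text.length(), (k + 1) * size)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";
        System.out.println(split(fullText, 6));
    }
}
```

Because each segment is computed from its own index, the result order does not depend on map iteration order, and the lambda has no side effects.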

Answer 2: (score: 1)

You can also split at every nth character with a regular expression; this one splits exactly once every 6 characters:

String s ="anldhhdhdhhdhdhhdhdhdhdhdhd";
String[] str = s.split("(?<=\\G.{6})");
System.out.println(Arrays.toString(str));

Output:

[anldhh, dhdhhd, hdhhdh, dhdhdh, dhd]
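If the segment length needs to vary, the same regex can be built from a parameter. The sketch below is one way to do that; RegexSegments and splitEvery are hypothetical names. It also adds the (?s) flag, on the assumption that the text may contain line breaks, since a plain '.' does not match line terminators.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class RegexSegments {

    // Hypothetical helper: splits 's' after every 'n' characters.
    static String[] splitEvery(String s, int n) {
        // \G anchors at the end of the previous match (or the start of input),
        // so the lookbehind succeeds only at positions that are a multiple of n.
        // (?s) lets '.' match line terminators as well.
        return Pattern.compile("(?s)(?<=\\G.{" + n + "})").split(s);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitEvery("anldhhdhdhhdhdhhdhdhdhdhdhd", 6)));
    }
}
```

Compiling the pattern once and reusing it would be preferable if the method is called often with the same n.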

Answer 3: (score: 1)

Why not use a while loop that essentially iterates in increments of 6 until fewer than 6 characters remain?

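I'm not sure how you intend to use these segments, so for now I've just left print statements that resemble the expected sample output you provided: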

public class StringSegmenter {

    private static final int SEG_LENGTH = 6;
    private static final String PREFIX = "Segment%s: %s\n";

    public static void main(String[] args) {
        String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";

        int position = 0;
        int length = fullText.length();
        int segmentationCount = 0;

        // If at least 6 characters remain, prints a full segment.
        // If fewer than 6 characters remain, prints the remainder and exits the loop.
        while (position < length) {
            segmentationCount++;

            if ((length - position) < SEG_LENGTH) {

                // Replace this with logging, or StringBuilder appending, etc...
                // substring(position) runs to the end of the string, so the last character is kept
                System.out.printf(PREFIX, segmentationCount, fullText.substring(position));
                break;
            }
            // Replace this with logging, or StringBuilder appending, etc...
            System.out.printf(PREFIX, segmentationCount, fullText.substring(position, position + SEG_LENGTH));
            position += SEG_LENGTH;
        }
    }
}