Split a string after approx. 100 characters and the next symbol (Java)

Date: 2016-06-23 10:12:21

Tags: java regex string split

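I want to split a string after approx. 200 characters, at the next special sign:

The string has a format like <data>|...|<data>|, where each <data> block is between 30 and 70 characters long.

The result I want is a String array like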
<data>|<data>|
<data>|
<data>|<data>|<data>|

with each line about 200 characters long.

My code looks like this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;

public class RegexpTest {

@Test
public void testRegexp() throws Exception {
    String data = "Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|";
    String pat = ".{1,200}(\\d|\\s|\\w|\\.|\\:{1,70})\\|";
    String ans = data.replaceAll(pat, "X");
    //Pattern regex = Pattern.compile(pat);
    //Matcher regexMatcher = regex.matcher(str);

    System.out.println(data.length()); //prints 528
    System.out.println(ans.length()); //prints 3
}
}

This produces the correct number of replacements (3), but the overall result should be a String array.
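For reference, the matches of such a pattern can be collected with Matcher.find() instead of being replaced. The following is only a minimal sketch under some assumptions: the class name RegexChunker and the helper chunk are hypothetical, the pattern is a simplified form of the attempt above, the input contains no line breaks, and every block ends with a pipe. A block longer than the limit, or trailing text without a closing pipe, would be partly or fully skipped by this approach.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexChunker {

    // Hypothetical helper: returns each regex match as one chunk.
    // Assumes maxLen >= 2 and that every block ends with "|".
    static String[] chunk(String data, int maxLen) {
        // Greedy: up to maxLen - 1 characters followed by a literal pipe,
        // so each match ends at the last pipe inside the window.
        Pattern pattern = Pattern.compile(".{1," + (maxLen - 1) + "}\\|");
        Matcher matcher = pattern.matcher(data);
        List<String> chunks = new ArrayList<>();
        while (matcher.find()) {
            chunks.add(matcher.group());
        }
        return chunks.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Rebuild the sample input: twelve identical 44-character blocks.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 12; i++) {
            sb.append("Symbol Ticker:1466654463000:157.71:TRADE:42|");
        }
        for (String chunk : chunk(sb.toString(), 200)) {
            System.out.println(chunk.length() + ": " + chunk);
        }
    }
}
```

With the sample data this yields three chunks of 176 characters each (four blocks per chunk), which matches the three replacements reported above.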

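Is there a regular expression that can handle this (similar to this SO Q&A)? A solution with a for loop is also acceptable.

Scratch Pad

Feel free to test on regex101.com (includes my attempt and the test data).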

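1 Answer:

Answer 0: (score: 2)

No regex needed. Just split the data at "|". Then check whether adding a part to the current line would exceed 200 characters. If it would, start a new line. Quick and dirty:

Edit: added comments and formatting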

import java.util.ArrayList;
import java.util.List;

public class Test {

    public static void main(String[] args) {
        // your data
        String data = "Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|Symbol Ticker:1466654463000:157.71:TRADE:42|";
        // do the split
        List<String> out = new Test().splitToApproxAt(data, 200);
        // print the split lines
        for (String o : out) {
            System.out.println(o);
        }
    }

    public List<String> splitToApproxAt(String data, int len) {
        // split at the pipe symbol "|"
        String[] parts = data.split("\\|");

        // this will be our current line in progress
        String line = "";

        // this will store the lines of up to len chars
        List<String> out = new ArrayList<String>();

        // for every data part
        for (String part : parts) {
            if (part.length() > len) {
                // a single part longer than len cannot fit on any line:
                // report it and skip it
                System.err.println("Skipping part longer than " + len + " chars: " + part);
                continue;
            }
            // would this exceed len chars?
            if (line.length() + part.length() > len) {
                // yes! add the previous line to the output
                // and start a new one
                out.add(line);
                line = part;
            } else {
                // no, we can attach the part to the
                // current line
                if (line.length() > 0) {
                    // delimit with pipe
                    line += "|" + part;
                } else {
                    // line was empty, no pipe
                    line = part;
                }
            }
        }
        // add the last line to the output, unless nothing was collected
        if (line.length() > 0) {
            out.add(line);
        }
        return out;
    }
}
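
Note that the split above drops the pipe at the end of each line, while the format asked for in the question keeps it (<data>|<data>|). A possible variant that keeps every trailing pipe could look like the sketch below. The class name PipeChunker and the method splitKeepingPipes are hypothetical, and putting an oversized block alone on its own line (rather than skipping it) is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.List;

public class PipeChunker {

    // Hypothetical variant: every block keeps its trailing "|".
    static List<String> splitKeepingPipes(String data, int maxLen) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        int start = 0;
        while (start < data.length()) {
            int pipe = data.indexOf('|', start);
            // A final block without a closing pipe runs to the end of the input.
            int end = (pipe < 0) ? data.length() : pipe + 1;
            String block = data.substring(start, end);
            // Flush the current line if this block would push it past maxLen.
            if (line.length() > 0 && line.length() + block.length() > maxLen) {
                lines.add(line.toString());
                line.setLength(0);
            }
            // Assumption: a block longer than maxLen stays alone on its line
            // instead of being dropped.
            line.append(block);
            start = end;
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Rebuild the sample input: twelve identical 44-character blocks.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 12; i++) {
            sb.append("Symbol Ticker:1466654463000:157.71:TRADE:42|");
        }
        for (String l : splitKeepingPipes(sb.toString(), 200)) {
            System.out.println(l);
        }
    }
}
```

Because the line is built with a StringBuilder and the input is scanned with indexOf, no intermediate array of parts is created, and concatenating the returned lines gives back the original input.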