Split a string into an array of words, except where they are between backslashes

Time: 2014-03-23 13:39:02

Tags: java regex string split

I have this code:

String path = main.getInput(); // let's say getInput() returns: Hello \Wo rld\
args = path.split("\\s+");

for (int i = 0; i < args.length; i++) {
     System.out.println(args[i]);
}

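Is there a way to split the string so that the words are separated and put into an array, but not when they sit between two backslashes, so that "Wo rld" counts as one word instead of two?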

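3 Answers:

Answer 0 (score: 4)

You can try splitting only on whitespace that is followed by an even number of backslashes in the rest of the string, i.e. whitespace that is not inside a \...\ pair. Raw regex: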

\s+(?=(?:[^\\]*\\[^\\]*\\)*[^\\]*$)

Java-escaped regex:

\\s+(?=(?:[^\\\\]*\\\\[^\\\\]*\\\\)*[^\\\\]*$)

ideone demo
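
As a rough sketch of how the escaped pattern could be passed to String.split: the class name SplitOutsideBackslashes and the helper split are made up for illustration, and the approach assumes the backslashes in the input come in pairs.

```java
import java.util.Arrays;

public class SplitOutsideBackslashes {
    // Whitespace followed by an even number of backslashes up to the end of
    // the input, i.e. whitespace that is not inside a \...\ pair.
    static final String REGEX = "\\s+(?=(?:[^\\\\]*\\\\[^\\\\]*\\\\)*[^\\\\]*$)";

    // Hypothetical helper: splits on whitespace outside backslash pairs.
    public static String[] split(String input) {
        return input.split(REGEX);
    }

    public static void main(String[] args) {
        String input = "Hello \\Wo rld\\"; // runtime value: Hello \Wo rld\
        System.out.println(Arrays.toString(split(input)));
        // prints: [Hello, \Wo rld\]
    }
}
```

Note that the lookahead rescans the rest of the string for every whitespace run, so this is probably best kept for short inputs.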

Answer 1 (score: 1)

Try this:

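// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String s = "John Hello \\Wo rld\\ our world";
// Group 1: a \...\ section (non-greedy); group 2: any other run of non-whitespace
Pattern pattern = Pattern.compile("(\\\\.*?\\\\)|(\\S+)");
Matcher m = pattern.matcher(s);
while (m.find()) {
    if (m.group(1) != null) {
        System.out.println(m.group(1));
    } else {
        System.out.println(m.group(2));
    }
}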

Output:

John
Hello
\Wo rld\
our
world
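
Since the question asks for an array rather than printed lines, the same pattern could be used to collect the matches. This is only a sketch: the class name TokenCollector and the method tokenize are hypothetical, and it assumes each \...\ section starts at the beginning of a token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenCollector {
    // Either a \...\ section (non-greedy) or a run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("(\\\\.*?\\\\)|(\\S+)");

    // Hypothetical helper: returns every match of TOKEN, in order, as an array.
    public static String[] tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group()); // whole match, whichever alternative matched
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String token : tokenize("John Hello \\Wo rld\\ our world")) {
            System.out.println(token);
        }
    }
}
```

Using m.group() avoids the null check on the two capture groups, because exactly one alternative matches on each find().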

Answer 2 (score: 0)

If it doesn't have to be a regex, you can use this simple parser and get the result in a single pass over the string.

// Requires: import java.util.ArrayList; import java.util.List;
public static List<String> spaceSplit(String str) {
    List<String> tokens = new ArrayList<>();

    StringBuilder sb = new StringBuilder();
    boolean insideEscaped = false; // true while between a pair of backslashes

    for (char ch : str.toCharArray()) {

        if (ch == '\\') {
            insideEscaped = !insideEscaped;
        }

        // split only on spaces that are outside the "escaped" area
        if (ch == ' ' && !insideEscaped) {
            if (sb.length() > 0) {
                tokens.add(sb.toString());
                sb.setLength(0); // reset the buffer for the next token
            }
        } else {
            // keep everything else, including spaces between backslashes
            sb.append(ch);
        }
    }
    // flush the last token, if any
    if (sb.length() > 0) {
        tokens.add(sb.toString());
    }

    return tokens;
}

Usage:

for (String s : spaceSplit("hello \\wo rld\\"))
    System.out.println(s);

Output:

hello
\wo rld\
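
If the delimiting backslashes should not appear in the tokens (so that the result is "wo rld" rather than "\wo rld\"), one possible variation of the parser is to toggle the flag without copying the backslash. This is an assumption about what the question wants, not something the thread states; the class name StrippingSplitter and the method splitAndStrip are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class StrippingSplitter {
    // Splits on spaces outside backslash pairs and drops the backslashes themselves.
    public static List<String> splitAndStrip(String str) {
        List<String> tokens = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        boolean inside = false; // true while between a pair of backslashes

        for (char ch : str.toCharArray()) {
            if (ch == '\\') {
                inside = !inside; // toggle the state, do not copy the backslash
            } else if (ch == ' ' && !inside) {
                if (sb.length() > 0) { // a space outside a pair ends the token
                    tokens.add(sb.toString());
                    sb.setLength(0);
                }
            } else {
                sb.append(ch);
            }
        }
        if (sb.length() > 0) {
            tokens.add(sb.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String s : splitAndStrip("hello \\wo rld\\")) {
            System.out.println(s);
        }
    }
}
```

With this variation an empty \\ pair produces no token at all, which may or may not be the desired behavior.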