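How to split a string in Java, including blanks (as in Python)

Time: 2015-12-31 22:58:22

Tags: java python string split

I am reading comma-separated lists into Java, where an element may be blank or a single space. Here are a few example lines: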

 ,achieve,achievement,achievable,,,    (note the space before the first comma)
agree,agreement,, ,agreeable,agreeably (note the space between commas)
,apartment,,                           (no spaces)

In Java, the String[] produced by line.split(",") renders every blank element as a space, except the trailing ones, which are omitted, like this:

" ", "achieve", "achievement", "achievable"
"agree", "agreement", " ", " ", "agreeable", "agreeably"
" ", "apartment"

I need every blank element rendered as an empty string and every single-space element rendered as a single space, like this:

" ", "achieve", "achievement", "achievable", "", "", ""
"agree", "agreement", "", " ", "agreeable", "agreeably"
"", "apartment", "", ""

How can I do this in Java?

5 answers:

Answer 0: (score: 4)

To avoid dropping trailing empty elements, use split(delimiter, limit) with a negative value for limit, such as

split(",", -1)

Sample:

String[] tests = {
        " ,achieve,achievement,achievable,,,",
        "agree,agreement,, ,agreeable,agreeably",
        ",apartment,,"
};
for (String line : tests){
    String[] elements = line.split(",", -1);
    // StringJoiner(delimiter, prefix, suffix); needs import java.util.StringJoiner (Java 8+)
    StringJoiner sj = new StringJoiner("\", \"", "\"", "\"");
    for (String element : elements){
        sj.add(element);
    }
    System.out.println(sj);
}

Output:

" ", "achieve", "achievement", "achievable", "", "", ""
"agree", "agreement", "", " ", "agreeable", "agreeably"
"", "apartment", "", ""

Answer 1: (score: 2)

If you want to split on commas and any surrounding whitespace, you can use this:

str.trim().split("\\s*,\\s*")
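A minimal sketch of splitting on commas together with any surrounding whitespace. The class name CommaTrimSplit and the helper splitTrimmed are hypothetical names used only for illustration. The pattern uses `\\s*` (zero or more whitespace characters) so that a bare comma still counts as a delimiter, and a limit of -1 is assumed so that trailing empty strings are kept.

```java
import java.util.Arrays;

public class CommaTrimSplit {
    // Hypothetical helper: split on commas, discarding whitespace around each comma.
    static String[] splitTrimmed(String line) {
        // \\s*,\\s* matches a comma plus zero or more whitespace characters on either side.
        // The negative limit keeps trailing empty strings in the result.
        return line.trim().split("\\s*,\\s*", -1);
    }

    public static void main(String[] args) {
        // Prints [agree, agreement, , agreeable]
        System.out.println(Arrays.toString(splitTrimmed("agree , agreement,,agreeable ")));
    }
}
```

Because whitespace next to a comma is consumed as part of the delimiter, an element consisting of a single space comes back as an empty string. This approach therefore suits input where such spaces are noise, not input where they must be preserved.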

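Answer 2: (score: 1)

If you want to replicate the behavior of Python's no-argument str.split(), you need to trim the string and then split on a regular expression that matches runs of whitespace, like this: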

str.trim().split("\\s+")
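A small sketch of the whitespace split above, wrapped in a hypothetical helper named splitOnWhitespace (the class name WhitespaceSplit is also an assumption). One difference is handled explicitly: splitting an empty string in Java yields a one-element array containing "", whereas Python's "".split() returns an empty list.

```java
import java.util.Arrays;

public class WhitespaceSplit {
    // Hypothetical helper approximating Python's no-argument str.split().
    static String[] splitOnWhitespace(String s) {
        String trimmed = s.trim();
        // Java returns [""] when splitting an empty string, while Python returns [],
        // so the empty case is handled here.
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        // \\s+ treats any run of whitespace as a single delimiter.
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Prints [agree, agreement, agreeable]
        System.out.println(Arrays.toString(splitOnWhitespace("  agree   agreement\tagreeable ")));
    }
}
```

Note that this splits on whitespace only, so it does not separate comma-delimited fields.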

Answer 3: (score: 1)

Here is a simple test program that I think illustrates what you are looking for:

public class s1 {
    public static void main( String[] args ) {
//      String si = " ,achieve,achievement,achievable,,,";
//      String si = "agree,agreement,, ,agreeable,agreeably";
        String si = ",apartment,,";
        String[] so = si.split(" *, *", -1);   /* split on comma and any space(s) next to it */
        for (String s : so) {
            System.out.println('"' + s + '"');
        }
    }

}

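Answer 4: (score: 0)

line.split(",") works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

If you instead use public String[] split(String regex, int limit) and call it as line.split(",", <any negative int>), the pattern will be applied as many times as possible and the array can have any length.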

  

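So you can call it as line.split(",", -9)

Here are the results for the different limit values:

  • limit = 0: the pattern is applied as many times as possible, the array can have any length, and trailing empty strings are discarded.
  • limit > 0: the pattern is applied at most limit - 1 times, the array's length is no greater than limit, and the last entry of the array contains all input beyond the last matched delimiter.
  • limit < 0: the pattern is applied as many times as possible, the array can have any length, and trailing empty strings are kept.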
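The three cases can be checked with a short sketch run on one of the sample lines from the question. The class name SplitLimitDemo and the wrapper splitWithLimit are hypothetical and exist only to make the comparison easy to run.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Hypothetical wrapper around String.split(regex, limit), splitting on commas.
    static String[] splitWithLimit(String line, int limit) {
        return line.split(",", limit);
    }

    public static void main(String[] args) {
        String line = ",apartment,,";

        // limit = 0: trailing empty strings are discarded, leaving 2 elements.
        System.out.println(Arrays.toString(splitWithLimit(line, 0)));

        // limit = 2: the pattern is applied once, so the remainder "apartment,,"
        // stays in the last element.
        System.out.println(Arrays.toString(splitWithLimit(line, 2)));

        // limit = -1: every empty string is kept, giving 4 elements.
        System.out.println(Arrays.toString(splitWithLimit(line, -1)));
    }
}
```

The leading empty string is kept in every case; the limit only affects how many times the pattern is applied and whether trailing empty strings are kept.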

Check the doc for more details.