Split/tokenize/scan a string while being aware of quotation marks

Time: 2010-07-01 18:23:06

Tags: java string quotation-marks

Is there a default/easy way in Java to split a string while respecting quotation marks or other delimiting symbols?

For example, given this text:

There's "a man" that live next door 'in my neighborhood', "and he gets me down..."

I would like to get:

There's
a man
that
live
next
door
in my neighborhood
and he gets me down

2 answers:

Answer 0: (score: 5)

Something like this works for your input:

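    import java.util.Scanner;
    import java.util.regex.Pattern;

    String text = "There's \"a man\" that live next door "
        + "'in my neighborhood', \"and he gets me down...\"";

    Scanner sc = new Scanner(text);
    Pattern pattern = Pattern.compile(
        "\"[^\"]*\"" +      // double-quoted token
        "|'[^']*'" +        // single-quoted token
        "|[A-Za-z']+"       // everything else: letters and apostrophes
    );
    String token;
    // findInLine returns the next match on the current line, or null when none is left
    while ((token = sc.findInLine(pattern)) != null) {
        System.out.println("[" + token + "]");
    }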

The above prints (as seen on ideone.com):

[There's]
["a man"]
[that]
[live]
[next]
[door]
['in my neighborhood']
["and he gets me down..."]

It uses Scanner.findInLine, where the regex pattern matches one of the following alternatives:

"[^"]*"      # double quoted token
'[^']*'      # single quoted token
[A-Za-z']+   # everything else

Admittedly, this will not work 100% of the time; cases where quotes can be nested and the like will be tricky.
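Note that the pattern above keeps the surrounding quotes in each token, whereas the question asks for them to be dropped. One possible variation, offered only as a sketch, wraps each alternative in a capturing group and reads whichever group matched. The class name QuoteAwareSplit and the method tokenize are hypothetical names chosen for illustration, and java.util.regex.Matcher is used in place of Scanner.findInLine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteAwareSplit {
    // Same three alternatives as above, each with a capturing group:
    // group 1 = body of a double-quoted token,
    // group 2 = body of a single-quoted token,
    // group 3 = bare word (letters and apostrophes).
    private static final Pattern TOKEN = Pattern.compile(
        "\"([^\"]*)\"|'([^']*)'|([A-Za-z']+)");

    // Hypothetical helper: returns the tokens without their enclosing quotes.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            if (m.group(1) != null) {
                tokens.add(m.group(1));
            } else if (m.group(2) != null) {
                tokens.add(m.group(2));
            } else {
                tokens.add(m.group(3));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "There's \"a man\" that live next door "
            + "'in my neighborhood', \"and he gets me down...\"";
        for (String token : tokenize(text)) {
            System.out.println(token);
        }
    }
}
```

Punctuation inside quotes is kept as-is, so the last token still ends in "..."; trimming it would be a separate step. The nesting caveat above applies to this variation as well.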


Answer 1: (score: 1)

With your logic, it is doubtful that you can differentiate between an apostrophe and a single quote, i.e. There's versus 'in my neighborhood'.

If you want what you have above, you would have to develop some kind of pairing logic. I'm thinking regex, or some kind of two-part parse.
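As a rough illustration of what such pairing logic might look like, here is a hand-written scanner. It rests on an assumed heuristic that is not stated anywhere in the question: a single quote opens a quoted token only when it is not preceded by a letter, and closes one only when it is not followed by a letter, so an apostrophe between letters stays inside its word. The names PairingTokenizer, tokenize, letterAt and findClosingSingleQuote are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class PairingTokenizer {

    static List<String> tokenize(String s) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '"') {
                // Pair a double quote with the next double quote, if any.
                int close = s.indexOf('"', i + 1);
                if (close >= 0) {
                    out.add(s.substring(i + 1, close));
                    i = close + 1;
                    continue;
                }
            } else if (c == '\'' && !letterAt(s, i - 1)) {
                // Assumed rule: a quote not preceded by a letter may open a token.
                int close = findClosingSingleQuote(s, i + 1);
                if (close >= 0) {
                    out.add(s.substring(i + 1, close));
                    i = close + 1;
                    continue;
                }
            } else if (Character.isLetter(c)) {
                // A word: letters, plus apostrophes that are followed by a letter.
                int end = i + 1;
                while (end < s.length()
                        && (Character.isLetter(s.charAt(end))
                            || (s.charAt(end) == '\'' && letterAt(s, end + 1)))) {
                    end++;
                }
                out.add(s.substring(i, end));
                i = end;
                continue;
            }
            i++; // whitespace, punctuation, or an unpaired quote: skip it
        }
        return out;
    }

    // True if idx is inside the string and holds a letter.
    static boolean letterAt(String s, int idx) {
        return idx >= 0 && idx < s.length() && Character.isLetter(s.charAt(idx));
    }

    // Assumed rule: the closing quote is the first one not followed by a letter.
    static int findClosingSingleQuote(String s, int from) {
        for (int j = s.indexOf('\'', from); j >= 0; j = s.indexOf('\'', j + 1)) {
            if (!letterAt(s, j + 1)) {
                return j;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("'don't stop' now"));
    }
}
```

Under this heuristic, the input 'don't stop' now yields the tokens don't stop and now. It can still guess wrong, for example on a plural possessive such as dogs' that appears inside a single-quoted phrase, so it should be read as a starting point and not as a complete solution.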