Emulating String.split with StringTokenizer

Date: 2013-07-11 08:42:33

Tags: java stringtokenizer

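I am converting code from an existing application so that it compiles with a Java 1.1 compiler for a piece of custom hardware. This means I cannot use String.split(regex) to turn my existing strings into arrays.

I wrote a method that should give the same result as String.split(regex), but something is wrong with it and I can't figure out what.

Code: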

private static String[] split(String delim, String line) {
  StringTokenizer tokens = new StringTokenizer(line, delim, true);
  String previous = "";
  Vector v = new Vector();

  while(tokens.hasMoreTokens()) {
    String token = tokens.nextToken();

    if(!",".equals(token)) {
      v.add(token);
    } else if(",".equals(previous)) {
      v.add("");
    } else {
      previous = token;
    }
  }

  return (String[]) v.toArray(new String[v.size()]);
}

Sample input:

RM^RES,0013A2004081937F,,9060,1234FF

Sample output:

String line = "RM^RES,0013A2004081937F,,9060,1234FF";
String[] items = split(",", line);

for(String s : items) {
    System.out.println(" [ " + s + " ] ");
}
  

[RM^RES] [0013A2004081937F] [] [] [9060] [] [1234FF]

Expected output:

[RM^RES] [0013A2004081937F] [] [9060] [1234FF]


The old code I am trying to convert:

String line = "RM^RES,0013A2004081937F,,9060,1234FF";
String[] items = line.split(",");

for(String s : items) {
    System.out.println(" [ " + s + " ] ");
}
  

[RM^RES] [0013A2004081937F] [] [9060] [1234FF]
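One way to see where the extra empty entries come from is to print the raw token stream. With returnDelims set to true, StringTokenizer returns every delimiter as its own token, so the sample line yields eight tokens, four of them commas. In the method above, previous becomes "," at the first comma and is never changed afterwards, so each later comma adds an empty string. The class and method names below are made up for illustration, and the sketch sticks to Vector methods that already exist in Java 1.1.

```java
import java.util.StringTokenizer;
import java.util.Vector;

// Hypothetical helper, not part of the original question.
class TokenTrace {
    // Collects every token, delimiters included, in the order
    // StringTokenizer returns them when returnDelims is true.
    static Vector rawTokens(String line, String delim) {
        Vector out = new Vector();
        StringTokenizer tokens = new StringTokenizer(line, delim, true);
        while (tokens.hasMoreTokens()) {
            out.addElement(tokens.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        Vector t = rawTokens("RM^RES,0013A2004081937F,,9060,1234FF", ",");
        // Prints one token per line, prefixed with its position.
        for (int i = 0; i < t.size(); i++) {
            System.out.println(i + ": " + t.elementAt(i));
        }
    }
}
```

The two adjacent commas show up at positions 3 and 4; that pair is the only place where an empty field should be emitted.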

5 Answers:

Answer 0 (score: 4)

I modified your code and tested it. It works (and remember to avoid hard-coding "," so that you can use the function with any delimiter):

private static String[] split(String delim, String line) {

    StringTokenizer tokens = new StringTokenizer(line, delim, true);
    String previous = delim;
    Vector v = new Vector();

    while (tokens.hasMoreTokens()) {
        String token = tokens.nextToken();

        if (!delim.equals(token)) {
            v.add(token);
        } else if (previous.equals(delim)) {
            v.add("");
        }
        previous = token;
    }

    return (String[]) v.toArray(new String[v.size()]);
}
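Vector.add and Vector.toArray arrived with the Collections framework in Java 1.2, so a strict Java 1.1 class library may not have them. The sketch below is the same logic rewritten with addElement and copyInto, which are part of the original Vector API. Whether the target toolchain actually needs this is an assumption, and the class name is hypothetical.

```java
import java.util.StringTokenizer;
import java.util.Vector;

// Hypothetical class name; logic follows the answer above.
class Split11 {
    static String[] split(String delim, String line) {
        StringTokenizer tokens = new StringTokenizer(line, delim, true);
        // Starting with the delimiter makes a leading delimiter produce "".
        String previous = delim;
        Vector v = new Vector();

        while (tokens.hasMoreTokens()) {
            String token = tokens.nextToken();
            if (!delim.equals(token)) {
                v.addElement(token);      // ordinary field
            } else if (previous.equals(delim)) {
                v.addElement("");         // two delimiters in a row: empty field
            }
            previous = token;
        }

        // copyInto replaces toArray, which Java 1.1 does not provide.
        String[] result = new String[v.size()];
        v.copyInto(result);
        return result;
    }
}
```

Like the answer, this assumes a single-character delimiter, because StringTokenizer treats each character of its delimiter string as a separate delimiter.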

Answer 1 (score: 1)

Almost everything is right. Almost, because you forgot to "clear" the value of previous. Try this:

if(!",".equals(token)) {
  v.add(token);
  previous = "";
} else if(",".equals(previous)) {
  v.add("");
  previous = "";
} else {
  previous = token;
}
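For context, here is the snippet placed back into a complete method, as a sketch. Only the if/else chain comes from the answer; the wrapper class, the single-argument signature, and the use of addElement and copyInto (chosen to stay within the Java 1.1 Vector API) are assumptions for illustration.

```java
import java.util.StringTokenizer;
import java.util.Vector;

// Hypothetical wrapper; only the if/else chain is taken from the answer above.
class PatchedSplit {
    static String[] split(String line) {
        StringTokenizer tokens = new StringTokenizer(line, ",", true);
        String previous = "";
        Vector v = new Vector();

        while (tokens.hasMoreTokens()) {
            String token = tokens.nextToken();
            if (!",".equals(token)) {
                v.addElement(token);
                previous = "";           // a field clears the "saw a comma" state
            } else if (",".equals(previous)) {
                v.addElement("");
                previous = "";           // also cleared after emitting an empty field
            } else {
                previous = token;
            }
        }

        String[] result = new String[v.size()];
        v.copyInto(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split("RM^RES,0013A2004081937F,,9060,1234FF").length); // 5
        System.out.println(split("a,,,b").length); // 3
    }
}
```

Because previous is also cleared after an empty field is added, a run of three commas produces one empty entry, whereas "a,,,b".split(",") produces two. For inputs with at most two adjacent delimiters, such as the sample line, the result matches.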

Answer 2 (score: 0)

Don't use StringTokenizer at all:

private static String[] split(String delim, String line) {
    String current = line;
    int index = line.indexOf(delim);
    Vector vector = new Vector();
    while (index != -1) {
        vector.add(current.substring(0, index));
        // Skip the whole delimiter, not just one character.
        current = current.substring(index + delim.length());
        index = current.indexOf(delim);
    }
    vector.add(current);

    return (String[]) vector.toArray(new String[vector.size()]);
}
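The same indexOf idea can be written with a moving start offset, which avoids creating a shortened copy of the remaining text on every pass. This is a sketch under a few assumptions: the delimiter is a literal string rather than a regular expression, an empty delimiter returns the line unchanged, and only Java 1.1 Vector methods are used. The class name is hypothetical.

```java
import java.util.Vector;

// Hypothetical class name; a variation on the indexOf loop above.
class IndexOfSplit {
    static String[] split(String delim, String line) {
        // An empty delimiter would match at every position and never advance.
        if (delim.length() == 0) {
            return new String[] { line };
        }

        Vector parts = new Vector();
        int start = 0;
        int index = line.indexOf(delim, start);
        while (index != -1) {
            parts.addElement(line.substring(start, index));
            start = index + delim.length();   // step over the whole delimiter
            index = line.indexOf(delim, start);
        }
        parts.addElement(line.substring(start)); // text after the last delimiter

        String[] result = new String[parts.size()];
        parts.copyInto(result);
        return result;
    }
}
```

One difference from String.split remains: trailing empty strings are kept, so "a,b," gives three elements here while String.split(",") drops the last one.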

Answer 3 (score: 0)

You can try it like this:

public static void main(String[] args) {
    for (String s : split(",", "RM^RES,0013A2004081937F, ,9060,1234FF")) {
        System.out.print(" [ " + s + " ] ");
    }
}

private static String[] split(String delim, String line) {
    StringTokenizer tokens = new StringTokenizer(line, delim);
    String[] v = new String[tokens.countTokens()];
    int i = 0;
    while (tokens.hasMoreTokens()) {
        v[i] = tokens.nextToken();
        i++;
    }
    return v;
}
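Note that the input in main has a space between the two commas. Without the returnDelims flag, StringTokenizer treats a run of adjacent delimiters as a single separator and never returns an empty token, so the original line from the question would come back with four fields instead of five. A small check, with a made-up class name:

```java
import java.util.StringTokenizer;

// Hypothetical demo class.
class CountDemo {
    // Number of tokens StringTokenizer reports when delimiters are not returned.
    static int count(String line, String delim) {
        return new StringTokenizer(line, delim).countTokens();
    }

    public static void main(String[] args) {
        // Adjacent commas collapse into one separator.
        System.out.println(count("RM^RES,0013A2004081937F,,9060,1234FF", ","));  // 4
        // With a space between them, the space becomes a token of its own.
        System.out.println(count("RM^RES,0013A2004081937F, ,9060,1234FF", ",")); // 5
    }
}
```

So this version only preserves the field count if the data is changed so that no field is truly empty.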

Answer 4 (score: 0)

I think you should not make any assumptions about the underlying delimiter.

    public static String[] split(String line, String delim) {
        Vector v = new Vector();
        final String EMPTY_STRING = "";
        StringTokenizer st = new StringTokenizer(line, delim, true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();

            if (token.equals(delim)) {
                if (v.isEmpty() || v.size() > 0 && !EMPTY_STRING.equals(v.get(v.size() - 1))) {
                    v.add(EMPTY_STRING);
                }
            } else {
                v.add(token);
            }
        }

        return (String[])v.toArray(new String[v.size()]);
    }
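Tracing the sample line through the method above, the check on the last stored element cannot tell "the previous token was a field" from "the previous token was a delimiter", so an empty string ends up after each non-empty field. A sketch of an alternative that keeps the goal of not assuming a particular delimiter: it remembers whether the previous token was a delimiter, and it treats the second argument as a set of delimiter characters, which is how StringTokenizer itself interprets it. The class name and the delims parameter name are hypothetical.

```java
import java.util.StringTokenizer;
import java.util.Vector;

// Hypothetical class; treats every character of delims as a delimiter.
class SetSplit {
    static String[] split(String line, String delims) {
        Vector v = new Vector();
        // true at the start so that a leading delimiter yields an empty field
        boolean lastWasDelim = true;
        StringTokenizer st = new StringTokenizer(line, delims, true);

        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            // Delimiters come back as one-character tokens drawn from delims.
            boolean isDelim = token.length() == 1 && delims.indexOf(token) >= 0;
            if (!isDelim) {
                v.addElement(token);
            } else if (lastWasDelim) {
                v.addElement("");   // nothing between this delimiter and the last
            }
            lastWasDelim = isDelim;
        }

        String[] result = new String[v.size()];
        v.copyInto(result);
        return result;
    }
}
```

With this version, split("a,;b", ",;") returns three elements with an empty string in the middle. Handling of repeated trailing delimiters may still differ from String.split.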