appendCodePoint() and codePointAt()

Time: 2014-12-22 06:04:14

Tags: java string

Why does the following program print false, and what changes do I have to make for it to print true?

public class Main {

    static int[] codePoints(String s) {
        int n = s.length();
        int[] temp = new int[n];
        for (int i = 0; i < n; i++)
            temp[i] = s.codePointAt(i);
        return temp;
    }

    static String construct(int[] codePoints) {
        StringBuilder sb = new StringBuilder();
        for (int i : codePoints)
            sb.appendCodePoint(i);
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("The symbol ");
        sb.appendCodePoint(Character.MAX_VALUE + 1);
        sb.append(" is not in the Basic Multilingual Plane.");
        String s = sb.toString();
        System.out.println(s.equals(construct(codePoints(s))));
    }
}

1 answer:

Answer 0 (score: 5)

The problem is here:

static int[] codePoints(String s) {
    int n = s.length();
    int[] temp = new int[n];
    for (int i = 0; i < n; i++)
        temp[i] = s.codePointAt(i); // <-- HERE
    return temp;
}

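Code points outside the BMP are two chars wide, not one (see Character.toChars()). The loop above calls codePointAt() at every char index, so after reading the full code point at the high surrogate it reads again at the low surrogate, where codePointAt() returns the low surrogate by itself. That extra value ends up in the array, and the rebuilt string is one char longer than the original. If you encounter such a code point, you need to detect it and advance your index by one more: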

static int[] codePoints(final String s)
{
    final int len = s.length();
    final int[] ret = new int[s.codePointCount(0, len)];
    int nrCodePoints = 0;
    int codePoint;
    int index;
    for (index = 0; index < len; index++) {
        codePoint = s.codePointAt(index);
        ret[nrCodePoints++] = codePoint;
        if (codePoint > Character.MAX_VALUE)
            index++;
    }
    return ret;
}
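
As a variation on the fix above, here is a minimal sketch that advances the index with Character.charCount() instead of comparing against Character.MAX_VALUE. The class name CodePointRoundTrip is made up for this illustration; construct() and the test string are taken from the question.

```java
public class CodePointRoundTrip {

    // Collect the code points of s. The index advances by
    // Character.charCount(cp): 1 for a BMP code point, 2 for a
    // supplementary one stored as a surrogate pair.
    static int[] codePoints(String s) {
        int[] result = new int[s.codePointCount(0, s.length())];
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result[count++] = cp;
            i += Character.charCount(cp);
        }
        return result;
    }

    // Same as construct() in the question.
    static String construct(int[] codePoints) {
        StringBuilder sb = new StringBuilder();
        for (int cp : codePoints)
            sb.appendCodePoint(cp);
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = new StringBuilder("The symbol ")
                .appendCodePoint(Character.MAX_VALUE + 1)
                .append(" is not in the Basic Multilingual Plane.")
                .toString();
        // length() counts chars, so it is one more than the number
        // of code points for this string.
        System.out.println(s.length() - s.codePointCount(0, s.length()));
        System.out.println(s.equals(construct(codePoints(s)))); // prints true
    }
}
```

If Java 8 or later is available, s.codePoints().toArray() should give the same array without a hand-written loop.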