When is String.codePointBefore() useful?

Time: 2016-01-25 06:40:18

Tags: java unicode

Is there a use case where codePointBefore() is advantageous? If you already have an index, couldn't you just call codePointAt(i-1)?

1 answer:

Answer 0 (score: 3)

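A code point may consist of more than one char, because a char is still only 16 bits wide. The methods in String take an index into the underlying char[] value array, not an index counted in code points. They check the bounds and then delegate to methods in Character: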

//Java 8 java.lang.String source code
public int codePointAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointAtImpl(value, index, value.length);
}
//...
public int codePointBefore(int index) {
    int i = index - 1;
    if ((i < 0) || (i >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointBeforeImpl(value, index, 0);
}
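
To make the char-index point concrete, here is a minimal sketch using only the public String API. The class name CharIndexDemo and the sample string are assumptions chosen for illustration; U+1F600 is used because it lies outside the Basic Multilingual Plane and is therefore stored as two chars.

```java
public class CharIndexDemo {
    // Hypothetical sample: "a", then U+1F600 (stored as the surrogate
    // pair D83D DE00), then "b".
    static final String S = "a\uD83D\uDE00b";

    public static void main(String[] args) {
        // length() counts chars, not code points.
        System.out.println(S.length());                              // 4
        System.out.println(S.codePointCount(0, S.length()));         // 3

        // Index 1 is the high surrogate, so codePointAt looks forward
        // and combines it with the low surrogate at index 2.
        System.out.println(Integer.toHexString(S.codePointAt(1)));   // 1f600

        // Index 2 lands in the middle of the pair: only the low
        // surrogate is returned.
        System.out.println(Integer.toHexString(S.codePointAt(2)));   // de00
    }
}
```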

The corresponding methods in Character recognize and combine the multiple chars that belong to a single code point:

//Java 8 java.lang.Character source code
static int codePointAtImpl(char[] a, int index, int limit) {
    char c1 = a[index];
    if (isHighSurrogate(c1) && ++index < limit) {
        char c2 = a[index];
        if (isLowSurrogate(c2)) {
            return toCodePoint(c1, c2);
        }
    }
    return c1;
}
//...
static int codePointBeforeImpl(char[] a, int index, int start) {
    char c2 = a[--index];
    if (isLowSurrogate(c2) && index > start) {
        char c1 = a[--index];
        if (isHighSurrogate(c1)) {
            return toCodePoint(c1, c2);
        }
    }
    return c2;
}

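The distinction matters because index-1 is not always the start of the preceding code point. codePointBefore() therefore has to begin at index-1 and look backward, whereas codePointAt() has to begin at index and look forward.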
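
As an illustration of where the two calls diverge, the sketch below compares them on a string containing a surrogate pair and then uses codePointBefore() to walk a string from the end. The class name CodePointBeforeDemo and the helper reverseCodePoints are hypothetical names introduced for this example; only standard java.lang and java.util calls are used.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointBeforeDemo {
    // Hypothetical helper: collects the code points of s from last to first.
    static List<Integer> reverseCodePoints(String s) {
        List<Integer> out = new ArrayList<>();
        int i = s.length();
        while (i > 0) {
            int cp = s.codePointBefore(i);   // looks backward from i
            out.add(cp);
            i -= Character.charCount(cp);    // step back 1 or 2 chars
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b";  // "a", U+1F600, "b"
        int i = 3;                     // char index of 'b'

        // codePointBefore sees the low surrogate at i-1 and joins it
        // with the high surrogate at i-2.
        System.out.println(Integer.toHexString(s.codePointBefore(i)));  // 1f600

        // codePointAt(i-1) starts on the low surrogate and can only
        // look forward, so it returns the lone surrogate.
        System.out.println(Integer.toHexString(s.codePointAt(i - 1)));  // de00

        System.out.println(reverseCodePoints(s));  // [98, 128512, 97]
    }
}
```

Backward traversal of this kind is one plausible use case: with codePointAt() alone, the caller would first have to check whether i-1 is a low surrogate and adjust the index by hand.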