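What's wrong with this code when dealing with large "long" numbers?

Asked: 2011-09-12 13:13:10

Tags: java math puzzle

I wrote a utility class to encode numbers in a custom numeral system with base N. Like any self-respecting Java programmer, I then wrote a unit test to check that the code works as expected (for any number I could throw at it).

It turned out that it worked for small numbers. For sufficiently large numbers, however, the tests failed.

The code: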

public class EncodeUtil {

    private String symbols;

    private boolean isCaseSensitive;
    private boolean useDefaultSymbols;

    private int[] symbolLookup = new int[255];

    public EncodeUtil() {
        this(true);
    }

    public EncodeUtil(boolean isCaseSensitive) {
        this.useDefaultSymbols = true;
        setCaseSensitive(isCaseSensitive);
    }

    public EncodeUtil(boolean isCaseSensitive, String symbols) {
        this.useDefaultSymbols = false;
        setCaseSensitive(isCaseSensitive);
        setSymbols(symbols);
    }

    public void setSymbols(String symbols) {
        this.symbols = symbols;
        fillLookupArray();
    }

    public void setCaseSensitive(boolean isCaseSensitive) {
        this.isCaseSensitive = isCaseSensitive;
        if (useDefaultSymbols) {
            setSymbols(makeAlphaNumericString(isCaseSensitive));
        }
    }

    private void fillLookupArray() {
        //reset lookup array
        for (int i = 0; i < symbolLookup.length; i++) {
            symbolLookup[i] = -1;
        }
        for (int i = 0; i < symbols.length(); i++) {
            char c = symbols.charAt(i);
            if (symbolLookup[(int) c] == -1) {
                symbolLookup[(int) c] = i;
            } else {
                throw new IllegalArgumentException("duplicate symbol:" + c);
            }
        }
    }

    private static String makeAlphaNumericString(boolean caseSensitive) {
        StringBuilder sb = new StringBuilder(255);
        int caseDiff = 'a' - 'A';
        for (int i = 'A'; i <= 'Z'; i++) {
            sb.append((char) i);
            if (caseSensitive) sb.append((char) (i + caseDiff));
        }
        for (int i = '0'; i <= '9'; i++) {
            sb.append((char) i);
        }
        return sb.toString();
    }

    public String encodeNumber(long decNum) {
        return encodeNumber(decNum, 0);
    }

    public String encodeNumber(long decNum, int minLen) {
        StringBuilder result = new StringBuilder(20);
        long num = decNum;
        long mod = 0;
        int base = symbols.length();
        do {
            mod = num % base;
            result.append(symbols.charAt((int) mod));
            num = Math.round(Math.floor((num-mod) / base));
        } while (num > 0);
        if (result.length() < minLen) {
            for (int i = result.length(); i < minLen; i++) {
                result.append(symbols.charAt(0));
            }
        }
        return result.toString();
    }

    public long decodeNumber(String encNum) {
        if (encNum == null) return 0;
        if (!isCaseSensitive) encNum = encNum.toUpperCase();
        long result = 0;
        int base = symbols.length();
        long multiplier = 1;
        for (int i = 0; i < encNum.length(); i++) {
            char c = encNum.charAt(i);
            int pos = symbolLookup[(int) c];
            if (pos == -1) {
                String debugValue = encNum.substring(0, i) + "[" + c + "]";
                if (encNum.length()-1 > i) {
                    debugValue += encNum.substring(i + 1);
                }
                throw new IllegalArgumentException(
                    "invalid symbol '" + c + "' at position " 
                    + (i+1) + ": " + debugValue);
            } else {
                result += pos * multiplier;
                multiplier = multiplier * base;
            }
        }
        return result;
    }

    @Override
    public String toString() {
        return symbols;
    }

}

The test:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class EncodeUtilTest {

    @Test
    public void testRoundTrip() throws Exception {
        //for some reason, numbers larger than this range will not be decoded correctly
        //maybe some bug in JVM with arithmetic with long values?
        //tried also BigDecimal, didn't make any difference
        //anyway, it is highly improbable that we ever need such large numbers
        long value = 288230376151711743L;
        test(value, new EncodeUtil());
        test(value, new EncodeUtil(false));
        test(value, new EncodeUtil(true, "1234567890qwertyuiopasdfghjklzxcvbnm"));
    }

    @Test
    public void testRoundTripMax() throws Exception {
        //this will fail, see above
        test(Long.MAX_VALUE, new EncodeUtil());
    }

    @Test
    public void testRoundTripGettingCloserToMax() throws Exception {
        //here we test different values, getting closer to Long.MAX_VALUE
        //this will fail, see above
        EncodeUtil util = new EncodeUtil();
        for (long i = 1000; i > 0; i--) {
            System.out.println(i);
            test(Long.MAX_VALUE / i, util);
        }
    }

    private void test(long number, EncodeUtil util) throws Exception {
        String encoded = util.encodeNumber(number);
        long result = util.decodeNumber(encoded);
        long diff = number - result;
        //System.out.println(number + " = " + encoded + " diff " + diff);
        assertEquals("original=" + number + ", result=" + result + ", encoded=" + encoded, 0, diff);
    }

}

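Any ideas why things start to fail when the values get large? I also tried BigInteger, but it didn't seem to make a difference.

2 answers:

Answer 0 (score: 7)

You're using floating-point math in your encodeNumber method, which makes your code depend on the precision of the double type.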

Replacing

num = Math.round(Math.floor((num-mod) / base));

with

num = (num - mod) / base;

makes the tests pass. In fact,

num = num / base;

should work just as well (thought experiment: what is 19 / 10 when / is integer division?).
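To illustrate the point, here is a minimal sketch of the encode/decode round trip using only integer arithmetic. The class name IntegerDivisionSketch, the SYMBOLS table, and the simplified encode and decode methods are assumptions made for this example; they are not the asker's EncodeUtil, only the same digit-by-digit idea reduced to its core.

```java
public class IntegerDivisionSketch {

    // Hypothetical symbol table for illustration: 62 alphanumeric symbols.
    static final String SYMBOLS =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Encodes a non-negative number, least significant digit first,
    // without ever converting to double.
    static String encode(long num) {
        StringBuilder result = new StringBuilder(20);
        int base = SYMBOLS.length();
        do {
            result.append(SYMBOLS.charAt((int) (num % base)));
            num = num / base; // integer division already discards the remainder
        } while (num > 0);
        return result.toString();
    }

    // Decodes a string produced by encode().
    static long decode(String encoded) {
        long result = 0;
        long multiplier = 1;
        int base = SYMBOLS.length();
        for (int i = 0; i < encoded.length(); i++) {
            result += SYMBOLS.indexOf(encoded.charAt(i)) * multiplier;
            multiplier *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        long value = Long.MAX_VALUE;
        // Reports whether the round trip preserved the value.
        System.out.println(decode(encode(value)) == value);
    }
}
```

Because every intermediate value stays a long, the quotient is exact at each step, so the round trip no longer depends on how many digits a double can hold.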

Answer 1 (score: 2)

Your code converts to double, which can produce strange results for larger values.

num = Math.round(Math.floor((num-mod) / base));

That would be my first port of call.
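The effect of that conversion can be seen in isolation. The sketch below is an illustration only: the class DoublePrecisionSketch and the helper viaDouble are hypothetical names, and the helper merely mirrors the shape of the expression above (a long widened to double for Math.floor, then turned back into a long by Math.round). A double has a 53-bit significand, so not every long above 2^53 survives the trip.

```java
public class DoublePrecisionSketch {

    // The long argument is implicitly widened to double for Math.floor,
    // then Math.round converts the double back to a long.
    static long viaDouble(long value) {
        return Math.round(Math.floor(value));
    }

    public static void main(String[] args) {
        long exact = 1L << 53;         // 2^53 is exactly representable as a double
        long inexact = (1L << 53) + 1; // 2^53 + 1 is not representable as a double

        System.out.println(viaDouble(exact) == exact);     // prints "true"
        System.out.println(viaDouble(inexact) == inexact); // prints "false"
    }
}
```

In the question's loop the value passed through double is the quotient, which would explain why failures only appear once the number being encoded is large enough for its first quotient to exceed what a double can represent exactly.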