When does TimSort complain about a broken comparator?

Time: 2014-07-25 08:25:27

Tags: java java-7 timsort

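Java 7 changed the sorting algorithm so that it throws an

java.lang.IllegalArgumentException: "Comparison method violates its general contract!"

in some cases when the comparator used is buggy. Is it possible to tell what kind of bug in the comparator causes this? In my experiments it did not matter if x != x, if x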

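(If there were a general rule for this, it might be easier to look for the bug in the comparator. But of course it is better to fix all bugs. :-))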

In particular, the following two comparators did not make TimSort complain:

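    // Note: Compare.LESSER / EQUAL / GREATER are the asker's own constants, not shown here
    // (presumably -1 / 0 / 1); Ordering below is Guava's com.google.common.collect.Ordering.
    final Random rnd = new Random(52);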

    Comparator<Integer> brokenButNoProblem1 = new Comparator<Integer>() {
        @Override
        public int compare(Integer o1, Integer o2) {
            if (o1 < o2) {
                return Compare.LESSER;
            } else if (o1 > o2) {
                return Compare.GREATER;
            }
            return rnd.nextBoolean() ? Compare.LESSER : Compare.GREATER;
        }
    };

    Comparator<Integer> brokenButNoProblem2 = new Comparator<Integer>() {
        @Override
        public int compare(Integer o1, Integer o2) {
            if (o1 == o2) {
                return Compare.EQUAL;
            }
            return rnd.nextBoolean() ? Compare.LESSER : Compare.GREATER;
        }
    };

But the following comparator did make it throw:

    Comparator<Integer> brokenAndThrowsUp = new Comparator<Integer>() {
        @Override
        public int compare(Integer o1, Integer o2) {
            if (Math.abs(o1 - o2) < 10) {
                return Compare.EQUAL; // WRONG and does matter
            }
            return Ordering.natural().compare(o1, o2);
        }
    };

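Update: with some real-life data we hit a failure where there were no x, y, z with x = y and y = z but x < z. So it seems my guess was wrong, and the exception is not tied to this specific kind of failure alone. Any better ideas?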
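To experiment with this, a small probe along the following lines may help. It is only a sketch: the class name `BrokenComparatorProbe`, the helper names, the value range and the array sizes are all made up for illustration, and the `fuzzy` comparator merely imitates `brokenAndThrowsUp` above using plain `Integer.compare` instead of Guava. How often the exception appears depends on the data and the JDK version, so no particular counts should be expected.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

class BrokenComparatorProbe {

    /** Returns true if sorting random data with cmp throws IllegalArgumentException. */
    static boolean sortThrows(Comparator<Integer> cmp, int size, long seed) {
        Random data = new Random(seed);
        Integer[] values = new Integer[size];
        for (int i = 0; i < size; i++) {
            values[i] = data.nextInt(1000); // hypothetical value range
        }
        try {
            Arrays.sort(values, cmp);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    /** Counts how many of the seeds 0..runs-1 make the sort throw. */
    static int countFailures(Comparator<Integer> cmp, int size, int runs) {
        int failures = 0;
        for (long seed = 0; seed < runs; seed++) {
            if (sortThrows(cmp, size, seed)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Treats values closer than 10 as equal, similar to brokenAndThrowsUp.
        Comparator<Integer> fuzzy =
                (a, b) -> Math.abs(a - b) < 10 ? 0 : Integer.compare(a, b);
        for (int size : new int[] {10, 100, 1000, 10000}) {
            System.out.println(size + " elements: "
                    + countFailures(fuzzy, size, 100) + " of 100 runs threw");
        }
    }
}
```

Running it with several sizes shows whether the exception depends on the input at hand rather than on the comparator alone.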

2 answers:

Answer 0: (score: 5)

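After looking at the code of ComparableTimSort I am not quite sure. Let's analyze it. Here is the only method that throws the exception (there is a similar method that does the same thing with the roles exchanged, so analyzing one of them is enough).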

    private void mergeLo(int base1, int len1, int base2, int len2) {
        assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

        // Copy first run into temp array
        Object[] a = this.a; // For performance
        Object[] tmp = ensureCapacity(len1);

        int cursor1 = tmpBase; // Indexes into tmp array
        int cursor2 = base2;   // Indexes int a
        int dest = base1;      // Indexes int a
        System.arraycopy(a, base1, tmp, cursor1, len1);

        // Move first element of second run and deal with degenerate cases
        a[dest++] = a[cursor2++];
        if (--len2 == 0) {
            System.arraycopy(tmp, cursor1, a, dest, len1);
            return;
        }
        if (len1 == 1) {
            System.arraycopy(a, cursor2, a, dest, len2);
            a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
            return;
        }

        int minGallop = this.minGallop;  // Use local variable for performance
    outer:
        while (true) {
            int count1 = 0; // Number of times in a row that first run won
            int count2 = 0; // Number of times in a row that second run won

            /*
             * Do the straightforward thing until (if ever) one run starts
             * winning consistently.
             */
    // ------------------ USUAL MERGE
            do {
                assert len1 > 1 && len2 > 0;
                if (((Comparable) a[cursor2]).compareTo(tmp[cursor1]) < 0) {
                    a[dest++] = a[cursor2++];
                    count2++;
                    count1 = 0;
                    if (--len2 == 0)
                        break outer;
                } else {
                    a[dest++] = tmp[cursor1++];
                    count1++;
                    count2 = 0;
                    if (--len1 == 1)
                        break outer;
                }
            } while ((count1 | count2) < minGallop);

    // ------------------ GALLOP
            /*
             * One run is winning so consistently that galloping may be a
             * huge win. So try that, and continue galloping until (if ever)
             * neither run appears to be winning consistently anymore.
             */
            do {
                assert len1 > 1 && len2 > 0;
                count1 = gallopRight((Comparable) a[cursor2], tmp, cursor1, len1, 0);
                if (count1 != 0) {
                    System.arraycopy(tmp, cursor1, a, dest, count1);
                    dest += count1;
                    cursor1 += count1;
                    len1 -= count1;
    // -->>>>>>>> HERE IS WHERE GALLOPING TOO FAR WILL TRIGGER THE EXCEPTION
                    if (len1 <= 1)  // len1 == 1 || len1 == 0
                        break outer;
                }
                a[dest++] = a[cursor2++];
                if (--len2 == 0)
                    break outer;

                count2 = gallopLeft((Comparable) tmp[cursor1], a, cursor2, len2, 0);
                if (count2 != 0) {
                    System.arraycopy(a, cursor2, a, dest, count2);
                    dest += count2;
                    cursor2 += count2;
                    len2 -= count2;
                    if (len2 == 0)
                        break outer;
                }
                a[dest++] = tmp[cursor1++];
                if (--len1 == 1)
                    break outer;
                minGallop--;
            } while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
            if (minGallop < 0)
                minGallop = 0;
            minGallop += 2;  // Penalize for leaving gallop mode
        }  // End of "outer" loop
        this.minGallop = minGallop < 1 ? 1 : minGallop;  // Write back to field

        if (len1 == 1) {
            assert len2 > 0;
            System.arraycopy(a, cursor2, a, dest, len2);
            a[dest + len2] = tmp[cursor1]; //  Last elt of run 1 to end of merge
        } else if (len1 == 0) {
            throw new IllegalArgumentException(
                "Comparison method violates its general contract!");
        } else {
            assert len2 == 0;
            assert len1 > 1;
            System.arraycopy(tmp, cursor1, a, dest, len1);
        }
    }

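The method merges two sorted runs. It performs a usual merge, but starts "galloping" once it notices that one side keeps "winning" (i.e., is always less than the other). Galloping tries to speed things up by looking ahead several elements instead of comparing one element at a time. Since the runs are supposed to be sorted, looking ahead is fine.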

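You can see that the exception is only thrown when len1 is 0 at the end. The first observation is this: during the usual merge, the exception can never be thrown, because the loop aborts as soon as len1 reaches 1. Thus, the exception can only be thrown as the result of a gallop.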

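This already gives a strong hint that the exception behaviour is unreliable: as long as you have small data sets (so small that a generated run may never gallop, as MIN_GALLOP is 7), or the generated runs always happen to produce a merge that never gallops, you will never receive the exception. Thus, without further reviewing the gallopRight method, we can conclude that you cannot rely on the exception: it may never be thrown, no matter how wrong your comparator is.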
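The small-data case can be illustrated with a sketch like the one below. It relies on an implementation detail that is an assumption here rather than something shown above: in the OpenJDK TimSort sources, arrays shorter than 32 elements (the MIN_MERGE constant) are sorted by binary insertion sort without any merge step, so mergeLo is never reached. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

class SmallArrayNeverThrows {

    /**
     * Sorts `size` integers with a comparator that ignores its arguments
     * and answers by coin flip. Returns true if the sort threw
     * IllegalArgumentException.
     */
    static boolean throwsWithRandomComparator(int size, long seed) {
        Random rnd = new Random(seed);
        Comparator<Integer> coinFlip = (a, b) -> rnd.nextBoolean() ? -1 : 1;
        Integer[] values = new Integer[size];
        for (int i = 0; i < size; i++) {
            values[i] = i;
        }
        try {
            Arrays.sort(values, coinFlip);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int thrown = 0;
        for (long seed = 0; seed < 1000; seed++) {
            // 31 elements: assumed to stay below the merge threshold.
            if (throwsWithRandomComparator(31, seed)) {
                thrown++;
            }
        }
        System.out.println("Exceptions with 31 elements: " + thrown);
    }
}
```

Even a comparator this badly broken goes undetected when no merge happens, which is consistent with the conclusion that the exception cannot be relied on.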

Answer 1: (score: -1)

From the documentation:

IllegalArgumentException - (optional) if the natural ordering of the array elements is found to violate the Comparable contract

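I did not find much on the mentioned contract, but IMHO it should represent a total order (i.e., the relation defined by the compareTo method has to be transitive, antisymmetric, and total). If that requirement is not met, sort might throw an IllegalArgumentException. (I say might because a failure to meet this requirement could go unnoticed.)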

Edit: added links to the properties that make a relation a total order.
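Since the sort itself may not notice a violation, one option is to test the comparator directly against sample values. The following is a minimal brute-force sketch, not a standard API: `ComparatorContractCheck` and `findViolation` are hypothetical names, and the three checks follow the rules stated in the `Comparator.compare` Javadoc (sign symmetry, transitivity, and equal elements comparing alike against any third element). It is cubic in the number of samples and only meaningful for deterministic comparators.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ComparatorContractCheck {

    /** Returns a description of the first violation found, or null if there is none. */
    static <T> String findViolation(Comparator<? super T> cmp, List<T> samples) {
        for (T x : samples) {
            for (T y : samples) {
                int xy = Integer.signum(cmp.compare(x, y));
                int yx = Integer.signum(cmp.compare(y, x));
                // sgn(compare(x, y)) must equal -sgn(compare(y, x)).
                if (xy != -yx) {
                    return "sign symmetry broken for " + x + ", " + y;
                }
                for (T z : samples) {
                    int yz = Integer.signum(cmp.compare(y, z));
                    int xz = Integer.signum(cmp.compare(x, z));
                    // x > y and y > z must imply x > z.
                    if (xy > 0 && yz > 0 && xz <= 0) {
                        return "transitivity broken for " + x + ", " + y + ", " + z;
                    }
                    // compare(x, y) == 0 must imply x and y compare alike against z.
                    if (xy == 0 && xz != yz) {
                        return "equal elements differ against " + z + ": " + x + ", " + y;
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Treats values closer than 10 as equal.
        Comparator<Integer> fuzzy =
                (a, b) -> Math.abs(a - b) < 10 ? 0 : Integer.compare(a, b);
        List<Integer> samples = Arrays.asList(0, 5, 12);
        System.out.println(findViolation(fuzzy, samples));
    }
}
```

With the samples 0, 5 and 12, the fuzzy comparator is flagged: 0 and 5 compare as equal, 5 and 12 compare as equal, yet 0 compares as less than 12.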