Question

我想在整数数组中找到第一个整数的索引，它是＆lt; = key。我可以在log2（N）+1比较中使用二进制搜索来完成。不应该只用log2（N）进行比较吗？

// Returns the index of the first integer in keys <= key. size must be a power of 2.
unsigned find(int key, int* keys, unsigned size) {    
    int* origKeys = keys;

    for (int i = 0; i < log2(size); i++) {
        size /= 2;

        if (keys[size] < key)
            keys += size;
    }

    unsigned i = unsigned(keys - origKeys);
    // Not only is this step messy, because it doesn't fit the loop, but
    // now we have log2(n) + 1 comparisons.
    if (*keys < key)
        i++;

    return i;
}

Answer 1

让我们从信息理论的角度思考这个问题。如果你有一个包含n个元素的数组，那么新元素可以有n + 1个可能的位置：就在数组的任何元素之前，或者在所有元素之后。因此，您的算法需要进行足够的比较，以便能够唯一地识别哪个n + 1个点是正确的。如果你没有进行足够的比较，那么你回答的答案并不总是正确的。

在最好的情况下，您所做的每一次比较都可以消除一半的可能位置。因此，在理论极限中，通过k比较，您可以确定2 ^ k个位置中的哪个是正确的。由于存在n + 1个可能的位置，因此在最坏的情况下需要进行lg（n + 1）比较，而不是lg n。由于你的n是2的完美幂，这意味着需要进行一次额外的比较。另一方面，如果你的n小于2的完美幂，那么进行ceil（lg n）比较就足够了。

由Eloff编辑，这段代码似乎正如您预测的那样给出正确的log2（n + 1）步骤：

// Returns the index of the first integer in keys <= key. size must be one less than a power of 2.
unsigned find(int key, int* keys, unsigned size) {    
    int* origKeys = keys;
    size++;

    while(size > 1) {
        size /= 2;

        if (keys[size-1] < key)
            keys += size;
    }

    return unsigned(keys - origKeys);        
}

为什么我的二进制搜索需要额外的比较？ LOG2（N）+1

1 个答案: