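How do I implement the Hoare partition scheme in Quickselect?

Time: 2019-04-10 08:29:22

Tags: python quicksort partition quickselect

I am trying to implement the Hoare partition scheme as part of a Quickselect algorithm, but it seems to give me a different answer every time.

Here is the findKthBest function, which finds the K-th largest number in an array, given the array (data) and the bounds of the range to search (low = 0 and high = 4 for an array of 5 elements):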

import random

def findKthBest(k, data, low, high):
    # choose random pivot
    pivotindex = random.randint(low, high)

    # move the pivot to the end
    data[pivotindex], data[high] = data[high], data[pivotindex]

    # partition
    pivotmid = partition(data, low, high, data[high])

    # move the pivot back
    data[pivotmid], data[high] = data[high], data[pivotmid]

    # continue with the relevant part of the list
    if pivotmid == k:
        return data[pivotmid]
    elif k < pivotmid:
        return findKthBest(k, data, low, pivotmid - 1)
    else:
        return findKthBest(k, data, pivotmid + 1, high)

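The partition() function takes four arguments:

  • data (the list, e.g. of 5 elements),
  • l (the start of the relevant part of the list, e.g. 0)
  • r (the end of the relevant part of the list, which is also where the pivot was placed, e.g. 4)
  • pivot (the value of the pivot)

def partition(data, l, r, pivot):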
    while True:
        while data[l] < pivot:
            #statistik.nrComparisons += 1
            l = l + 1
        r = r - 1    # skip the pivot
        while r != 0 and data[r] > pivot:
            #statistik.nrComparisons += 1
            r = r - 1
        if r > l:
            data[r], data[l] = data[l], data[r]
        return r

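Instead of a consistent result, I now get a different result on every run, and the recursion does not seem to work properly either (sometimes it ends with a maximum recursion depth error). What am I doing wrong?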

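1 answer:

Answer 0 (score: 0)

First, the partition() function appears to be wrong.

If you compare your code carefully with the code on Wikipedia, you will spot the difference. The function should be: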

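def partition(data, l, r, pivot):
    while True:
        while data[l] < pivot:
            #statistik.nrComparisons += 1
            l = l + 1
        r = r - 1    # skip the pivot (or the element swapped in the last round)
        while r > l and data[r] > pivot:
            #statistik.nrComparisons += 1
            r = r - 1
        if r <= l:
            # pointers met or crossed: stop here
            # return l, not r: the caller swaps the pivot from the end into this slot
            return l

        data[r], data[l] = data[l], data[r]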
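
One way to gain confidence in a partition routine of this shape is to test its postcondition on many inputs. The following is a minimal sketch, assuming the pivot has already been parked in data[r] and that the caller swaps it into the returned index afterwards. The names partition_pivot_last, is_partitioned and check_once are hypothetical helpers introduced only for this illustration.

```python
import random


def partition_pivot_last(data, l, r, pivot):
    """Two-pointer partition of data[l..r-1]; data[r] must hold the pivot.

    Returns the index where the pivot belongs (assumed convention).
    """
    while True:
        while data[l] < pivot:              # data[r] == pivot stops this scan
            l += 1
        r -= 1                              # skip the pivot slot / last swapped item
        while r > l and data[r] > pivot:
            r -= 1
        if r <= l:                          # pointers met or crossed
            return l
        data[l], data[r] = data[r], data[l]


def is_partitioned(data, low, high, mid):
    """True if data[low..mid-1] <= data[mid] <= data[mid+1..high]."""
    return (all(x <= data[mid] for x in data[low:mid])
            and all(x >= data[mid] for x in data[mid + 1:high + 1]))


def check_once(values):
    """Partition a copy of values around its last element and verify the result."""
    data = list(values)
    high = len(data) - 1
    mid = partition_pivot_last(data, 0, high, data[high])
    data[mid], data[high] = data[high], data[mid]   # move the pivot into place
    # The partition must hold and no element may be lost or duplicated.
    return is_partitioned(data, 0, high, mid) and sorted(data) == sorted(values)
```

Calling check_once on random lists that contain duplicates is a cheap way to catch a wrong return index or a scan that runs out of bounds.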

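Second, k has to be adjusted whenever the search continues in the lower part of the list. For example:

  • After partitioning, you get the array data = [1, 0, 2, 4, 3] with pivotmid = 2 (the pivot value 2 sits at index 2)
  • You want to find the 4th largest value (k = 4), which is 1

The pivot and the two elements to its right are the 3 largest values, so they are discarded. The next array passed to findKthBest() is [1, 0], and the rank inside it is k - 3 = 1.
Therefore, the next findKthBest() call should find the largest value of the array [1, 0]:

def findKthBest(k, data, low, high):
    ......

    # continue with the relevant part of the list
    # (k is a 1-based rank counted from the largest element)
    upper = high - pivotmid + 1    # the pivot plus everything to its right
    if k == upper:
        return data[pivotmid]
    elif k < upper:
        return findKthBest(k, data, pivotmid + 1, high)
    else:
        #Corrected
        return findKthBest(k - upper, data, low, pivotmid - 1)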
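
For reference, here is a self-contained sketch that combines a two-pointer partition with a rank adjustment. It rests on several assumptions that are not stated in the question: k is a 1-based rank counted from the largest element (k = 1 is the maximum), the pivot is parked in data[high] before partitioning, and partition() returns the index where the pivot belongs. Treat it as one possible arrangement rather than the definitive fix.

```python
import random


def partition(data, l, r, pivot):
    """Partition data[l..r-1] around pivot; data[r] is assumed to hold the pivot."""
    while True:
        while data[l] < pivot:              # the pivot in data[r] acts as a sentinel
            l += 1
        r -= 1                              # skip the pivot slot / last swapped item
        while r > l and data[r] > pivot:
            r -= 1
        if r <= l:                          # pointers met or crossed
            return l
        data[l], data[r] = data[r], data[l]


def findKthBest(k, data, low, high):
    """Return the k-th largest value (k = 1 is the maximum) of data[low..high]."""
    # choose a random pivot and move it to the end
    pivotindex = random.randint(low, high)
    data[pivotindex], data[high] = data[high], data[pivotindex]

    # partition, then move the pivot to its final position
    pivotmid = partition(data, low, high, data[high])
    data[pivotmid], data[high] = data[high], data[pivotmid]

    upper = high - pivotmid + 1             # the pivot plus everything to its right
    if k == upper:
        return data[pivotmid]
    elif k < upper:
        # the answer is among the larger elements; its rank there is unchanged
        return findKthBest(k, data, pivotmid + 1, high)
    else:
        # the `upper` largest elements are discarded, so the rank shrinks
        return findKthBest(k - upper, data, low, pivotmid - 1)
```

With the example from this answer, findKthBest(4, [1, 0, 2, 4, 3], 0, 4) returns 1 regardless of which pivots are drawn. Note that the function reorders data in place, so pass a copy if the original order matters.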