Question

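I created this program for an assignment in which we were required to implement Quichesort. It is a hybrid sorting algorithm that uses Quicksort until a certain recursion depth is reached (log2(N), where N is the length of the list), then switches to Heapsort to avoid exceeding the maximum recursion depth.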

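While testing my implementation, I found that although it usually performs better than regular Quicksort, Heapsort consistently outperforms both. Can anyone explain why Heapsort performs better, and under what circumstances Quichesort would be better than both Quicksort and Heapsort?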

Note that, for some reason, the assignment refers to the algorithm as "Quipsort".

Edit: Apparently, "Quichesort" is actually the same as Introsort.

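I also noticed a logic error in the medianOf3() function that causes it to return the wrong value for some inputs. Here is an improved version of the function: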

def medianOf3(lst):
    """
    From a lst of unordered data, find and return the median value from
    the first, middle and last values.
    """

    first, last = lst[0], lst[-1]
    if len(lst) <= 2:
        return min(first, last)
    middle = lst[(len(lst) - 1) // 2]
    return sorted((first, middle, last))[1]

Would this explain the algorithm's relatively poor performance?

Quichesort code:

import heapSort             # heapSort
import math                 # log2 (for quicksort depth limit)

def medianOf3(lst):
    """
    From a lst of unordered data, find and return the median value from
    the first, middle and last values.
    """

    first, last = lst[0], lst[-1]
    if len(lst) <= 2:
        return min(first, last)
    median = lst[len(lst) // 2]
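    # Known bug: this is not the median for every ordering, e.g. the values
    # (3, 1, 2) yield 1 instead of 2. See the corrected version above.
    return max(min(first, median), min(median, last))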

def partition(pivot, lst):
   """
   partition: pivot (element in lst) * List(lst) ->
        tuple(List(less), List(same), List(more)).
   Where:
        List(less) has values less than the pivot
        List(same) has pivot value/s, and
        List(more) has values greater than the pivot

   e.g. partition(5, [11,4,7,2,5,9,3]) == [4,2,3], [5], [11,7,9]
   """

   less, same, more = [], [], []
   for val in lst:
      if val < pivot:
         less.append(val)
      elif val > pivot:
         more.append(val)
      else:
         same.append(val)
   return less, same, more

def quipSortRec(lst, limit):
    """
    A non in-place, depth limited quickSort, using median-of-3 pivot.
    Once the limit drops to 0, it uses heapSort instead.
    """

    if lst == []:
        return []

    if limit == 0:
        return heapSort.heapSort(lst)

    limit -= 1
    pivot = medianOf3(lst)
    less, same, more = partition(pivot, lst)
    return quipSortRec(less, limit) + same + quipSortRec(more, limit)

def quipSort(lst):
    """
    The main routine called to do the sort.  It should call the
    recursive routine with the correct values in order to perform
    the sort
    """

    if not lst:
        return []   # math.log2(0) would raise ValueError
    depthLim = int(math.log2(len(lst)))
    return quipSortRec(lst, depthLim)

Heapsort code:

import heapq    # mkHeap (for adding/removing from heap)

def heapSort(lst):
    """
    heapSort(List(Orderable)) -> List(Ordered)
        performs a heapsort on 'lst' returning a new sorted list
    Postcondition: the argument lst is not modified
    """

    heap = list(lst)
    heapq.heapify(heap)
    result = []
    while len(heap) > 0:
        result.append(heapq.heappop(heap))
    return result

Answer 1

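The basic facts are as follows:

Heapsort has O(n log(n)) worst-case performance, but tends to be slow in practice.
Quicksort has O(n log(n)) performance on average and O(n^2) in the worst case, but is fast in practice.
Introsort aims to exploit quicksort's fast practical performance while still guaranteeing heapsort's O(n log(n)) worst-case behavior.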
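To make the worst case concrete, here is a minimal sketch (not the asker's code) that measures how deep a simple list-based quicksort recurses. The names quicksort_depth and first_element are hypothetical, and the first-element pivot is chosen deliberately as a bad pivot rule. On already-sorted input every partition is maximally unbalanced, which is the situation a depth limit of int(log2(n)) is meant to cut short.

```python
import math
import random


def quicksort_depth(lst, pick_pivot, depth=1):
    """Return the deepest recursion level reached by a list-based quicksort.

    Nothing is actually sorted here; only the shape of the recursion is measured.
    """
    if len(lst) <= 1:
        return depth
    pivot = pick_pivot(lst)
    less = [v for v in lst if v < pivot]
    more = [v for v in lst if v > pivot]
    return max(quicksort_depth(less, pick_pivot, depth + 1),
               quicksort_depth(more, pick_pivot, depth + 1))


def first_element(lst):
    """Deliberately naive pivot rule (an assumption for illustration)."""
    return lst[0]


n = 200
ordered = list(range(n))
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)

sorted_depth = quicksort_depth(ordered, first_element)      # one level per element
shuffled_depth = quicksort_depth(shuffled, first_element)   # far shallower on random data
depth_limit = int(math.log2(n))                             # the cutoff used in the question

print(sorted_depth, shuffled_depth, depth_limit)
```

With sorted input the recursion goes one level deeper per element, so a large enough list would hit Python's recursion limit. Switching to heapsort once the limit is used up bounds both the depth and the total work.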

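One question to ask is: why is quicksort faster "in practice" than heapsort? This is a hard question to answer, but most answers point to quicksort's better spatial locality, which leads to fewer cache misses. However, I'm not sure how applicable this is to Python, since it runs in an interpreter and has more going on under the hood than other languages (e.g. C) that could interfere with cache performance.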

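As for why your particular introsort implementation is slower than Python's heapsort: again, this is difficult to determine. First, note that the heapq module is written in Python, although CPython normally swaps in a C-accelerated implementation of its core functions when one is available, so the comparison may not be on entirely even footing. Creating and concatenating many small lists can be expensive, so you could try rewriting your quicksort to operate in place and see if that helps. You could also try tweaking various aspects of the implementation to see how they affect performance, or run the code through a profiler and see whether there are any hot spots. But in the end I think you're unlikely to find a definitive answer. It may just come down to which operations are particularly fast or slow in the Python interpreter.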
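The in-place rewrite suggested above could look roughly like the following. This is a sketch under stated assumptions, not a tuned implementation: the names introsort_in_place, _introsort, _median_of_3 and _heapsort_range are hypothetical, the partition is a three-way (Dutch national flag) scheme chosen to mirror the less/same/more split in the question, and the heapsort fallback still copies the affected slice, so only the quicksort phase avoids building new lists.

```python
import heapq
import math


def _median_of_3(lst, lo, hi):
    """Median of the first, middle and last values of lst[lo:hi] (hi exclusive)."""
    return sorted((lst[lo], lst[(lo + hi - 1) // 2], lst[hi - 1]))[1]


def _heapsort_range(lst, lo, hi):
    """Sort lst[lo:hi] via heapq. Note: this copies the slice into a heap."""
    heap = lst[lo:hi]
    heapq.heapify(heap)
    for i in range(lo, hi):
        lst[i] = heapq.heappop(heap)


def _introsort(lst, lo, hi, limit):
    if hi - lo <= 1:
        return
    if limit == 0:
        # Depth budget used up: fall back to heapsort for this range.
        _heapsort_range(lst, lo, hi)
        return
    pivot = _median_of_3(lst, lo, hi)
    # Three-way partition. Invariant while scanning:
    #   lst[lo:lt] < pivot, lst[lt:i] == pivot, lst[gt:hi] > pivot
    lt, i, gt = lo, lo, hi
    while i < gt:
        if lst[i] < pivot:
            lst[lt], lst[i] = lst[i], lst[lt]
            lt += 1
            i += 1
        elif lst[i] > pivot:
            gt -= 1
            lst[gt], lst[i] = lst[i], lst[gt]
        else:
            i += 1
    _introsort(lst, lo, lt, limit - 1)
    _introsort(lst, gt, hi, limit - 1)


def introsort_in_place(lst):
    """Sort lst in place, with the same int(log2(n)) depth limit as the question."""
    if len(lst) > 1:
        _introsort(lst, 0, len(lst), int(math.log2(len(lst))))


data = [(i * 37) % 101 for i in range(300)]
introsort_in_place(data)
```

Whether this actually beats the list-building version is something to measure rather than assume, for example by timing both with the standard timeit module on the same inputs.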
