Question

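I previously posted a similar question when comparing the running times of two different insertion sort implementations. Now I have a similar problem. I know that heap sort's complexity is O(n log n), the same as quicksort in the average case. However, these are my results when sorting a set of 10,000 randomly generated integers.

Quick sort: execution time: 0.005288

Heap sort: execution time: 0.234245

As you can see, my heap sort takes far longer than it should.

Here is the code for my max_heapify, build_max_heap and heapsort functions: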

void max_heapify(int *a,int n,int i)
{
    int largest = 0;

    if((2*i)<=n && a[2*i] > a[i]) largest = 2*i;
    else largest = i;

    if((2*i+1)<=n && a[2*i+1] > a[largest]) largest = 2*i+1;

    if(largest != i) 
    {
        swap(a[i],a[largest]);
        max_heapify(a,n,largest);
    }
}

void build_max_heap(int *a,int n)
{
    for(int i=1;i<=n/2;i++)
    {
        max_heapify(a,n,i);
    }
}

void heapsort(int *a,int n)
{   

    while(n>0)
    {
        build_max_heap(a,n);
        swap(a[1],a[n--]);
    }
}

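Can anyone point out the flaw in my implementation? The code works. I tested it myself on smaller sample sizes, but it is not efficient.

Any help is appreciated. Thanks!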

Answer 1

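I guess the repeated build_max_heap() is the flaw. Heapsort builds the heap once, then repeatedly removes the maximum element from the root node. The GIF animation on Wikipedia is an easy-to-follow example.

I evaluated 3 algorithms with random strings, N = 100,000: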

qsort(3)      usec = 54011  call = 0        compare = 1536365  copy = 0
qsort_trad()  usec = 67603  call = 99999    compare = 2368481  copy = 1344918
heap_sort()   usec = 88814  call = 1624546  compare = 3019351  copy = 1963682

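qsort(3) is the GNU C library's routine, a merge sort with index sorting. qsort_trad() is a conventional median-of-three quicksort. heap_sort() is my own implementation; its synopsis is the same as qsort(3).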
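
The fix described in this answer can be sketched as follows. This is a minimal sketch, not the asker's final code: it keeps the question's 1-based layout (elements live in a[1..n], a[0] is unused) and uses std::swap as an assumed stand-in for the question's swap. It also walks build_max_heap from n/2 down to 1, the usual bottom-up order; the question's upward loop does not guarantee that the maximum ends up at the root after a single pass.

```cpp
#include <utility>  // std::swap

// Sift a[i] down until the subtree rooted at i is a max-heap.
// 1-based layout as in the question: valid elements are a[1..n].
void max_heapify(int *a, int n, int i)
{
    for (;;) {
        int largest = i;
        int left = 2 * i;
        int right = 2 * i + 1;
        if (left <= n && a[left] > a[largest]) largest = left;
        if (right <= n && a[right] > a[largest]) largest = right;
        if (largest == i) return;      // heap property holds here
        std::swap(a[i], a[largest]);
        i = largest;                   // continue in the affected subtree
    }
}

// Build the heap bottom-up: start at the last parent, finish at the root.
void build_max_heap(int *a, int n)
{
    for (int i = n / 2; i >= 1; i--) max_heapify(a, n, i);
}

void heapsort(int *a, int n)
{
    build_max_heap(a, n);              // done once, O(n)
    while (n > 1) {
        std::swap(a[1], a[n]);         // current maximum goes to its final slot
        n--;                           // shrink the heap by one
        max_heapify(a, n, 1);          // repair from the root only, O(log n)
    }
}
```

The only structural change to heapsort is that build_max_heap moves out of the loop and a single max_heapify(a, n, 1) takes its place, so each extraction costs O(log n) instead of a full rebuild.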

Answer 2

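You are trying to reach a conclusion from a single random sample (no matter how many elements are in the sample), which is unreliable in the extreme.

The difference could also be caused by code outside the algorithm (perhaps the first algorithm to run spends time initializing something that does not need to be initialized again). This, too, can be addressed by running multiple samples (for example, by alternating between the algorithms).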
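
One way to follow this advice is a small harness that times both algorithms on several fresh random inputs, alternating between them, and reports the median. The sketch below is only an illustration: sort_a and sort_b are hypothetical stand-ins (std::sort and std::make_heap plus std::sort_heap), and compare, time_once and Medians are names made up here, not anything from the question.

```cpp
#include <algorithm>
#include <chrono>
#include <random>
#include <vector>

// Time one call of sorter on a private copy of the input; returns seconds.
template <typename Sorter>
double time_once(Sorter sorter, std::vector<int> data)
{
    auto start = std::chrono::steady_clock::now();
    sorter(data);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical stand-ins for the two algorithms being compared.
void sort_a(std::vector<int> &v) { std::sort(v.begin(), v.end()); }
void sort_b(std::vector<int> &v)
{
    std::make_heap(v.begin(), v.end());
    std::sort_heap(v.begin(), v.end());
}

struct Medians { double a; double b; };

// Run `rounds` (>= 1) random inputs of size n. Both algorithms see the
// same input in each round and are run back to back, so one-off startup
// costs are not charged to a single algorithm's only measurement.
Medians compare(int rounds, int n, unsigned seed)
{
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, 1000000);
    std::vector<double> ta, tb;
    for (int r = 0; r < rounds; r++) {
        std::vector<int> input(n);
        for (int &x : input) x = dist(rng);
        ta.push_back(time_once(sort_a, input));
        tb.push_back(time_once(sort_b, input));
    }
    std::sort(ta.begin(), ta.end());
    std::sort(tb.begin(), tb.end());
    return {ta[rounds / 2], tb[rounds / 2]};   // median of each series
}
```

Calling compare(21, 10000, 1) would give one median per algorithm for the 10,000-element case from the question; the median is used here because it is less sensitive than the mean to a single slow first run.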

Answer 3

FWIW - I found that sorting 30000 elements left several elements out of order.

void check(const int *a, int n)
{     
      int i=1;
      for(i=1; i<n; i++)
      {
         if(a[i]< a[i-1])
         {
            printf("out of order a[%d]=%d a[%d]=%d\n", i,a[i], i-1, a[i-1]);

         }
      }       
}

This produced:

out of order a[1]=0 a[0]=1171034357
out of order a[29989]=2147184723 a[29988]=2147452680
out of order a[29991]=2146884982 a[29990]=2147187654
out of order a[29993]=2145044438 a[29992]=2147179272
out of order a[29995]=2145040183 a[29994]=2146044388
out of order a[29996]=2130746386 a[29995]=2145040183
out of order a[29997]=2062733752 a[29996]=2130746386
out of order a[29998]=2026139713 a[29997]=2062733752
out of order a[29999]=1957918169 a[29998]=2026139713

Profiling:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 42.57      1.06     1.06                             _mcount_private
 28.11      1.76     0.70 225000000     0.00     0.00  max_heapify
 19.68      2.25     0.49                             __fentry__
  9.64      2.49     0.24    30000     0.01     0.03  build_max_heap
  0.00      2.49     0.00   427498     0.00     0.00  swap
  0.00      2.49     0.00        1     0.00     0.00  check
  0.00      2.49     0.00        1     0.00   940.00  heapsort

Ignore _mcount_private; it comes from the profiling itself.
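
The call counts in the profile are consistent with the heap being rebuilt on every pass. With the question's heapsort, build_max_heap(a, n) is called once for each n from 30000 down to 1 (the 30000 calls in the profile), and its loop issues n/2 calls to max_heapify each time. A short sketch of that arithmetic, where direct_heapify_calls is a name made up for illustration:

```cpp
// Number of max_heapify calls issued directly by build_max_heap's loop
// when heapsort calls build_max_heap(a, n) for n = N, N-1, ..., 1.
long long direct_heapify_calls(int N)
{
    long long total = 0;
    for (int n = N; n > 0; n--) {
        total += n / 2;   // the loop in build_max_heap runs i = 1..n/2
    }
    return total;
}
```

For N = 30000 this returns 225000000 (that is, 15000 squared), the same figure the profile reports for max_heapify. The count grows quadratically with N, which is why the running time is far from O(n log n).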

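Consider: 1. Make sure your code works correctly. 2. Then worry about execution speed. 3. @ScottHunter (+1) is correct: your analysis is flawed, and by the same token so is the profile above.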
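
For the first point, one option is to compare the routine's output against std::sort before timing anything. The sketch below assumes the question's calling convention, sorter(a, n) operating on a[1..n] with a[0] unused; matches_reference is a hypothetical helper name, not part of the original code.

```cpp
#include <algorithm>
#include <vector>

// Returns true if sorter(a, n) leaves a[1..n] equal to what std::sort
// produces for the same values. a[0] is padding, as in the question.
bool matches_reference(void (*sorter)(int *, int), const std::vector<int> &values)
{
    int n = static_cast<int>(values.size());
    std::vector<int> a(n + 1, 0);                        // a[0] unused
    std::copy(values.begin(), values.end(), a.begin() + 1);

    std::vector<int> expected(values);
    std::sort(expected.begin(), expected.end());         // reference result

    sorter(a.data(), n);
    return std::equal(expected.begin(), expected.end(), a.begin() + 1);
}
```

Unlike a check that only looks for adjacent pairs out of order, this also catches a routine that loses or duplicates elements, and it reads the same index range the sort writes.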

A second profile, with a different set of 30000 random numbers:

$ ./heapsort
out of order a[1]=0 a[0]=1451707585
out of order a[29989]=2147418839 a[29988]=2147447414
out of order a[29990]=2147336854 a[29989]=2147418839
out of order a[29991]=2147314086 a[29990]=2147336854
out of order a[29992]=2147202205 a[29991]=2147314086
out of order a[29993]=2147177981 a[29992]=2147202205
out of order a[29994]=2133236301 a[29993]=2147177981
out of order a[29996]=2100055600 a[29995]=2137349904
out of order a[29998]=2044161181 a[29997]=2108306471
out of order a[29999]=1580699248 a[29998]=2044161181

Owner@Owner-PC ~
$ gprof heapsort | more
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 45.20      1.13     1.13                             _mcount_private
 30.00      1.88     0.75 225000000     0.00     0.00  max_heapify
 13.20      2.21     0.33                             __fentry__
 11.60      2.50     0.29    30000     0.00     0.00  build_max_heap
  0.00      2.50     0.00   427775     0.00     0.00  swap
  0.00      2.50     0.00        1     0.00     0.00  check
  0.00      2.50     0.00        1     0.00     1.04  heapsort

Heap sort running time
