Question

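Based on that answer, here are two versions of a merge function used for mergesort. Could you help me understand why the second one is so much faster? I tested both on a list of 50000 elements, and the second one is 8 times faster (Gist).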

def merge1(left, right):
    i = j = inv = 0
    merged = []
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left[i:])

    merged += left[i:]
    merged += right[j:]
    return merged, inv

def merge2(array1, array2):
    inv = 0
    merged_array = []
    while array1 or array2:
        if not array1:
            merged_array.append(array2.pop())
        elif (not array2) or array1[-1] > array2[-1]:
            merged_array.append(array1.pop())
            inv += len(array2)
        else:
            merged_array.append(array2.pop())
    merged_array.reverse()
    return merged_array, inv

Here is the sort function:

def _merge_sort(list, merge):
    len_list = len(list)
    if len_list < 2:
        return list, 0
    middle = len_list // 2  # floor division, so the index stays an int on Python 3 too
    left, left_inv   = _merge_sort(list[:middle], merge)
    right, right_inv = _merge_sort(list[middle:], merge)
    l, merge_inv = merge(left, right)
    inv = left_inv + right_inv + merge_inv
    return l, inv

import numpy.random as nprnd
test_list = nprnd.randint(1000, size=50000).tolist()

test_list_tmp = list(test_list)
_merge_sort(test_list_tmp, merge1)

test_list_tmp = list(test_list)
_merge_sort(test_list_tmp, merge2)

Answer 1

Similar to kreativitea's answer above, but with more info (I think!)

Profiling the actual merge functions, merging two 50K arrays:

merge1

         311748 function calls in 15.363 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001   15.363   15.363 <string>:1(<module>)
        1   15.322   15.322   15.362   15.362 merge.py:3(merge1)
   221309    0.030    0.000    0.030    0.000 {len}
    90436    0.010    0.000    0.010    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

merge2

         250004 function calls in 0.104 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.104    0.104 <string>:1(<module>)
        1    0.074    0.074    0.103    0.103 merge.py:20(merge2)
    50000    0.005    0.000    0.005    0.000 {len}
   100000    0.010    0.000    0.010    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000    0.014    0.000    0.014    0.000 {method 'pop' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'reverse' of 'list' objects}

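So for merge1, that is 221309 len calls and 90436 append calls, and it takes 15.363 seconds. For merge2, it is 50000 len calls, 100000 append calls and 100000 pop calls, and it takes 0.104 seconds.

len, append and pop are all O(1) (more information here), so these profiles do not show what is actually taking the time. Going by the call counts alone, merge2 should be faster, but only by about 20%.

The reason is actually fairly obvious if you just read the code:

In the first method, there is this line: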

inv += len(left[i:])

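So every time this line runs, a new list has to be built, because the slice left[i:] copies all remaining elements. If you comment out this line (or just replace it with inv += 1 or something), merge1 becomes faster than the other method. This single line is responsible for the increased time.

Having identified this as the cause, the problem can be fixed by improving the code. Change the line to the following to speed it up; after this change, it is faster than merge2: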

inv += len(left) - i
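
To see the cost of the slice in isolation, here is a minimal timing sketch. The list size and the index `i` are arbitrary values chosen for illustration, and the absolute timings will vary by machine; only the relative difference matters.

```python
import timeit

left = list(range(50000))  # hypothetical sorted input, similar in size to the profiled lists
i = 100                    # arbitrary position reached during a merge

# Slicing copies the remaining len(left) - i elements before len() runs: O(n) per call.
slow = timeit.timeit(lambda: len(left[i:]), number=1000)

# Plain arithmetic on the already known length: O(1) per call.
fast = timeit.timeit(lambda: len(left) - i, number=1000)

# Both expressions give the same count, so the replacement does not change the result.
assert len(left[i:]) == len(left) - i

print("len(left[i:]) : %.4f s" % slow)
print("len(left) - i : %.4f s" % fast)
```

Since merge1 executes the slice once for every element taken from the right-hand list, that copy turns a linear merge into a quadratic one.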

The updated function:

def merge3(left, right):
    i = j = inv = 0
    merged = []
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left) - i

    merged += left[i:]
    merged += right[j:]
    return merged, inv
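
As a sanity check that the arithmetic version still counts inversions correctly, the sketch below compares merge3 against a brute-force count on a small input. The helper `count_inversions_brute_force` and the sample lists are hypothetical and exist only for this check.

```python
def merge3(left, right):
    # Same function as above, repeated so this example is self-contained.
    i = j = inv = 0
    merged = []
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left) - i  # every remaining left element is larger than right[j-1]
    merged += left[i:]
    merged += right[j:]
    return merged, inv


def count_inversions_brute_force(values):
    # O(n^2) reference: count index pairs a < b with values[a] > values[b].
    n = len(values)
    return sum(1 for a in range(n) for b in range(a + 1, n) if values[a] > values[b])


left = [1, 4, 6, 9]   # both halves must already be sorted
right = [2, 3, 7, 8]
merged, inv = merge3(left, right)

assert merged == sorted(left + right)
assert inv == count_inversions_brute_force(left + right)
print(merged, inv)
```

Because each half is already sorted, every inversion in `left + right` is a cross pair, which is exactly what the merge step counts.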

Answer 2

You can use the excellent cProfile module to help you solve things like this.

>>> import cProfile
>>> a = range(1,20000,2)
>>> b = range(0,20000,2)
>>> cProfile.run('merge1(a, b)')
         70002 function calls in 0.195 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.184    0.184    0.195    0.195 <pyshell#7>:1(merge1)
        1    0.000    0.000    0.195    0.195 <string>:1(<module>)
    50000    0.008    0.000    0.008    0.000 {len}
    19999    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


>>> cProfile.run('merge2(a, b)')
         50004 function calls in 0.026 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.016    0.016    0.026    0.026 <pyshell#12>:1(merge2)
        1    0.000    0.000    0.026    0.026 <string>:1(<module>)
    10000    0.002    0.000    0.002    0.000 {len}
    20000    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    20000    0.005    0.000    0.005    0.000 {method 'pop' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'reverse' of 'list' objects}

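After looking at the information a bit, it looks like the commenters are correct: it is not the len function. The time is reported under the <string> entry, which is the code string that cProfile.run executes rather than the string module. The same entry appears when you compare the length of things, as follows: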

>>> cProfile.run('0 < len(c)')
         3 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

It also appears when slicing a list, but that is a very fast operation.

>>> len(c)
20000000
>>> cProfile.run('c[3:2000000]')
         2 function calls in 0.011 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.011    0.011    0.011    0.011 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

TL;DR: Something under the <string> entry takes 0.195 seconds in the first function and 0.026 seconds in the second. Apparently, the list is rebuilt in the line inv += len(left[i:]).
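
The profiling approach in this answer can also be scripted. The following is a minimal sketch assuming Python 3 (the session above appears to be Python 2); `count_with_slices` is a hypothetical stand-in for the hot loop of merge1, not code from the question.

```python
import cProfile
import io
import pstats


def count_with_slices(values):
    # Hypothetical stand-in for merge1's inner loop: one slice per iteration.
    total = 0
    for i in range(len(values)):
        total += len(values[i:])
    return total


data = list(range(2000))  # arbitrary size chosen for illustration

profiler = cProfile.Profile()
profiler.enable()
result = count_with_slices(data)
profiler.disable()

# Render the five most expensive entries by own time into a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

Slicing is syntax rather than a function call, so it never gets a row of its own in the report. Its cost is folded into the tottime of the enclosing function, which is why the profiles above point at merge1 itself and not at len or append.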

Answer 3

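If I had to guess, I would say it probably has to do with the cost of removing elements from a list: removing from the end (pop) is quicker than removing from the beginning. The second version benefits from removing elements from the end of the list.

See Performance Notes: http://effbot.org/zone/python-list.htm

"The time needed to remove an item is about the same as the time needed to insert an item at the same location; removing items at the end is fast, removing items at the beginning is slow."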
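
The quoted claim can be checked with a short timing sketch. The helper names and the list size are assumptions made for illustration, and the timings will differ between machines.

```python
import timeit


def drain_from_end(n):
    items = list(range(n))
    while items:
        items.pop()     # O(1): no other element has to move
    return items


def drain_from_front(n):
    items = list(range(n))
    while items:
        items.pop(0)    # O(n): every remaining element shifts left by one
    return items


n = 20000  # arbitrary size chosen for illustration
end_time = timeit.timeit(lambda: drain_from_end(n), number=1)
front_time = timeit.timeit(lambda: drain_from_front(n), number=1)

print("pop()  from the end  : %.4f s" % end_time)
print("pop(0) from the front: %.4f s" % front_time)
```

Note that merge1 as posted indexes into its lists and never pops from the front, so this illustrates the quoted performance note rather than the slowdown measured in the question.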

Why is this version of mergesort faster