Better way to join two sorted lists of integers

Date: 2016-01-06 16:18:57

Tags: python algorithm optimization

Let's assume I have a list and a tuple, both already sorted:

A = [10, 20, 30, 40]
B = (20, 60, 81, 90)

What I need is to add all the elements of B into A, in such a way that A remains sorted.

The solution I could come up with is:

for item in B:
    for i in range(0, len(A)):
        if item > A[i]:
            i += 1
        else: 
            A.insert(i, item)

Assume A has size m and B has size n. In the worst case this solution takes O(m x n). How can I make it perform better?

6 answers:

Answer 0 (score: 11)

A simple way is heapq.merge:

A = [10, 20, 30, 40]

B = (20, 60, 81, 90)

from heapq import merge

for ele in merge(A,B):
    print(ele)

Output:

10
20
20
30
40
60
81
90
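
Since the question asks for A itself to end up holding the merged result, here is a minimal sketch of one way to do that. It is an assumption about the intended use, not part of the original answer: the lazy iterator returned by merge is materialized first, then written back into A with slice assignment.

```python
from heapq import merge

A = [10, 20, 30, 40]
B = (20, 60, 81, 90)

# merge() is lazy, so build the full result first, then overwrite the
# contents of A in place; A remains the same list object.
A[:] = list(merge(A, B))

print(A)
```

This still builds a temporary list of size m + n, but it runs in linear time and avoids repeated calls to insert.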

Some timings against the other O(n) solution:

In [53]: A = list(range(10000))

In [54]: B = list(range(1,20000,10))

In [55]: timeit list(merge(A,B))
100 loops, best of 3: 2.52 ms per loop

In [56]: %%timeit
C = []
i = j = 0
while i < len(A) and j < len(B):
    if A[i] < B[j]:
        C.append(A[i])
        i += 1
    else:
        C.append(B[j])
        j += 1
C += A[i:] + B[j:]
   ....: 
100 loops, best of 3: 4.29 ms per loop
In [58]: m =list(merge(A,B))
In [59]: m == C
Out[59]: True

If you want to roll your own, this is a bit faster than merge:

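from itertools import chain

def merger_try(a, b):
    # If either input is empty, yield the other's elements and stop;
    # otherwise the next() calls below would fail on the empty iterator.
    if not a or not b:
        for ele in chain(a, b):
            yield ele
        return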
    iter_a, iter_b = iter(a), iter(b)
    prev_a, prev_b = next(iter_a), next(iter_b)
    while True:
        if prev_a >= prev_b:
            yield prev_b
            try:
                prev_b = next(iter_b)
            except StopIteration:
                yield prev_a
                break
        else:
            yield prev_a
            try:
                prev_a = next(iter_a)
            except StopIteration:
                yield prev_b
                break
    for ele in chain(iter_b, iter_a):
        yield ele

Some timings:

In [128]: timeit list(merge(A,B))
1 loops, best of 3: 771 ms per loop

In [129]: timeit list(merger_try(A,B))
1 loops, best of 3: 581 ms per loop

In [130]: list(merger_try(A,B))  == list(merge(A,B))
Out[130]: True

In [131]: %%timeit                                 
C = []
i = j = 0
while i < len(A) and j < len(B):
    if A[i] < B[j]:
        C.append(A[i])
        i += 1
    else:
        C.append(B[j])
        j += 1
C += A[i:] + B[j:]
   .....: 
1 loops, best of 3: 919 ms per loop

Answer 1 (score: 4)

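The bisect module "provides support for maintaining a list in sorted order without having to sort the list after each insertion":

import bisect
for b in B:
    bisect.insort(A, b)

This solution does not create a new list.

Note that bisect.insort(A, b) is equivalent to

A.insert(bisect.bisect_right(A, b), b)

Even though the search is fast (O(log n)), the insertion is slow (O(n)).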
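
A small runnable sketch of this approach, using the A and B from the question. The copies A1 and A2 are illustrative names, not from the original answer. It also checks that insort gives the same result as finding the position with bisect_right and calling list.insert:

```python
import bisect

A = [10, 20, 30, 40]
B = (20, 60, 81, 90)

# Variant 1: let insort find the position and insert in one call.
A1 = list(A)
for b in B:
    bisect.insort(A1, b)

# Variant 2: binary-search the position, then insert explicitly.
A2 = list(A)
for b in B:
    A2.insert(bisect.bisect_right(A2, b), b)

print(A1)
print(A1 == A2)
```

Either way, every insertion may shift up to len(A) elements, so the overall worst case stays O(m x n) even though each search is logarithmic.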

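Answer 2 (score: 4)

There is a lot of good discussion in this post! Arguing about timing is hard, so I wrote a timing script. It is fairly rudimentary, but I think it will do for now. I have attached the results as well.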

import timeit
import math
import matplotlib.pyplot as plt
from collections import defaultdict


setup = """
import bisect
import heapq
from random import randint


A = sorted((randint(1, 10000) for _ in range({})))
B = sorted((randint(1, 10000) for _ in range({})))


def bisect_sol(A, B):
    for b in B:
        bisect.insort(A, b)


def merge_sol(A, B):
    ia = ib = 0
    while ib < len(B):
        if ia < len(A) and A[ia] < B[ib]:
            ia += 1
        else:
            # B[ib] belongs just before A[ia], the first element not smaller than it
            A.insert(ia, B[ib])
            ib += 1


def heap_sol(A, B):
    # heapq.merge is lazy; build the list so the merge work is actually timed
    return list(heapq.merge(A, B))


def sorted_sol(A, B):
    return sorted(A + B)
"""


sols = ["bisect", "merge", "heap", "sorted"]
times = defaultdict(list)
iters = [100, 1000, 2000, 5000, 10000, 20000, 50000, 100000]

for n in iters:
    for sol in sols:
        t = min(timeit.repeat(stmt="{}_sol(A, B)".format(sol), setup=setup.format(n, n), number=1, repeat=5))
        print("({}, {}) done".format(n, sol))
        times[sol].append(math.log(t))

for sol in sols:
    plt.plot(iters, times[sol])
plt.xlabel("iterations")
plt.ylabel("log time")
plt.legend(sols)
plt.show()

The results:

[Figure: Iterations vs. Time]

Clearly, modifying the list in place is the main bottleneck, so creating a new list is the way to go.

Answer 3 (score: 3)

Here is a solution in O(n):

A = [10, 20, 30, 40]
B = [20, 60, 81, 90]
C = []
i = j = 0
while i < len(A) and j < len(B):
    if A[i] < B[j]:
        C.append(A[i])
        i += 1
    else:
        C.append(B[j])
        j += 1
C += A[i:] + B[j:]
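
Note that the last line concatenates A[i:] and B[j:], which works only when both are lists; with the tuple B from the question it would raise a TypeError. Below is a hedged sketch of the same two-pointer merge wrapped in a function that accepts any two sorted sequences. The name merge_sorted is hypothetical.

```python
def merge_sorted(a, b):
    """Return a new sorted list with the elements of sorted sequences a and b."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    # At most one of these slices is non-empty; extend() takes lists and tuples alike.
    result.extend(a[i:])
    result.extend(b[j:])
    return result


A = [10, 20, 30, 40]
B = (20, 60, 81, 90)
A[:] = merge_sorted(A, B)  # write the result back into the original list
print(A)
```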

Answer 4 (score: 1)

Edit:

l1 = [10,20,30,40]
l2 = (10,20,30,40)
l2 = list(l2)
l1 = sorted(l1+l2)
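
A sketch of the same idea done in place, using the A and B from the question. CPython's list sort (Timsort) detects runs that are already sorted, so for two sorted inputs joined end to end it should need roughly one merge pass rather than a full O(n log n) sort. Treat that as a property of the implementation rather than a guarantee.

```python
A = [10, 20, 30, 40]
B = (20, 60, 81, 90)

A.extend(B)  # extend() accepts any iterable, so the tuple needs no conversion
A.sort()     # in-place sort; A stays the same list object
print(A)
```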

Answer 5 (score: 1)

You need to perform a merge. However, the "traditional" merge produces a new list, so you need a few modifications to extend one list in place.

ia = ib = 0
while ib < len(B):
    if ia < len(A) and A[ia] < B[ib]:
        ia += 1
    else:
        # B[ib] belongs just before A[ia], the first element not smaller than it
        A.insert(ia, B[ib])
        ib += 1