
Question

Given a list a = [1, 2, 2, 3] and its sublist b = [1, 2], find a list that complements b such that sorted(a) == sorted(b + complement). In the example above, complement would be the list [2, 3].

It is tempting to use a list comprehension:

complement = [x for x in a if x not in b]

or sets:

complement = list(set(a) - set(b))

but both of these return complement = [3].
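To see why both attempts lose the duplicate, here is a small sketch run against the example from the question. Both treat membership in b as all-or-nothing, so every 2 in a is discarded as soon as b contains a single 2.

```python
a = [1, 2, 2, 3]
b = [1, 2]

# Membership test: every 2 in a is dropped because 2 occurs in b at least once.
by_comprehension = [x for x in a if x not in b]

# Sets collapse duplicates before the difference is taken.
by_sets = list(set(a) - set(b))

print(by_comprehension)  # [3]
print(by_sets)           # [3]

# Neither result satisfies the requirement from the question.
print(sorted(a) == sorted(b + by_comprehension))  # False
```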

An obvious way of doing it would be:

complement = a[:]
for element in b:
    complement.remove(element)

but that feels deeply unsatisfying and not very Pythonic. Am I missing an obvious idiom, or is this the way?

As pointed out below, the performance of this is O(n^2). Is there a more efficient way?

Answer 1

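The only more declarative, and thus more Pythonic, way that comes to mind, and that also improves performance for large b (and a), is to use some sort of counter with decrement: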

from collections import Counter

class DecrementCounter(Counter):

    def decrement(self,x):
        if self[x]:
            self[x] -= 1
            return True
        return False

Now we can use a list comprehension:

b_count = DecrementCounter(b)
complement = [x for x in a if not b_count.decrement(x)]

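Here we keep track of the counts in b. For each element in a, we check whether it is still part of b_count. If it is, we decrement the counter and skip the element; otherwise we add it to complement. Note that this only works if we are sure that such a complement exists.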

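After constructing complement, you can check whether a valid complement exists with:

not bool(+b_count)

If this is False, then no such complement can be constructed (for instance with a=[1] and b=[1,3]). So a full implementation could be:

b_count = DecrementCounter(b)
complement = [x for x in a if not b_count.decrement(x)]
if +b_count:
    raise ValueError('complement cannot be constructed')

If dictionary lookup runs in O(1) (which it usually does; only on rare occasions is it O(n)), then this algorithm runs in O(|a| + |b|), i.e. the sum of the sizes of the two lists, whereas the remove approach will usually run in O(|a|×|b|).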
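A self-contained sketch of this approach wrapped in a function, including the failing case a=[1], b=[1, 3]. The name complement_of is made up for illustration and is not part of the answer.

```python
from collections import Counter


class DecrementCounter(Counter):
    def decrement(self, x):
        # Consume one occurrence of x if any is left.
        if self[x]:
            self[x] -= 1
            return True
        return False


def complement_of(a, b):  # hypothetical helper name
    b_count = DecrementCounter(b)
    complement = [x for x in a if not b_count.decrement(x)]
    # Any positive count left means b had elements that a could not match.
    if +b_count:
        raise ValueError('complement cannot be constructed')
    return complement


print(complement_of([1, 2, 2, 3], [1, 2]))  # [2, 3]

try:
    complement_of([1], [1, 3])
except ValueError as exc:
    print(exc)  # complement cannot be constructed
```

Elements of a are kept in their original order; only the first matching occurrences are consumed by b.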

Answer 2

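To reduce the complexity of your already working approach, you can use collections.Counter (a specialized dictionary with fast lookup) to count the items in both lists.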

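Then update the counts by subtracting the values, and finally filter the result by keeping only the items whose count is > 0, rebuilding/chaining it with itertools.chain: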

from collections import Counter
import itertools

a = [1, 2, 2, 2, 3]
b = [1, 2]

print(list(itertools.chain.from_iterable(x*[k] for k,x in (Counter(a)-Counter(b)).items() if x > 0)))

Result:

[2, 2, 3]
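A side note on the snippet above: Counter's - operator already drops entries whose count would end up zero or negative, so the if x > 0 filter is more of a safety net than a necessity. A quick sketch to check that on the same data:

```python
from collections import Counter

a = [1, 2, 2, 2, 3]
b = [1, 2]

diff = Counter(a) - Counter(b)
print(diff)  # Counter({2: 2, 3: 1})

# No zero or negative counts survive the subtraction.
print(all(count > 0 for count in diff.values()))  # True
```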

Answer 3

O(n log n), by sorting both lists and then walking them in parallel:

a = [1, 2, 2, 3]
b = [1, 2]
a.sort()
b.sort()

L = []
i = j = 0
while i < len(a) and j < len(b):
    if a[i] < b[j]:
        L.append(a[i])
        i += 1
    elif a[i] > b[j]:
        L.append(b[j])
        j += 1
    else:
        i += 1
        j += 1

while i < len(a):
    L.append(a[i])
    i += 1

while j < len(b):
    L.append(b[j])
    j += 1

print(L)
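A variant of the merge above written as a function, assuming b really is a sublist of a. It uses sorted() so the caller's lists are not reordered in place, and it raises instead of appending elements of b that are missing from a. The name sorted_complement is made up for this sketch.

```python
def sorted_complement(a, b):  # hypothetical helper name
    a = sorted(a)  # sorted copies, the inputs stay untouched
    b = sorted(b)

    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            # a[i] has no partner in b, so it belongs to the complement.
            result.append(a[i])
            i += 1
        elif a[i] == b[j]:
            # Matched pair: consume one occurrence from each list.
            i += 1
            j += 1
        else:
            # b[j] is smaller than everything left in a, so it cannot be matched.
            raise ValueError('b is not a sublist of a')

    if j < len(b):
        raise ValueError('b is not a sublist of a')

    # Whatever is left in a was never matched.
    result.extend(a[i:])
    return result


print(sorted_complement([1, 2, 2, 3], [1, 2]))  # [2, 3]
```

The result comes back in sorted order, not in the original order of a.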

Answer 4

If the order of the elements in the complement doesn't matter, then collections.Counter is all that is needed:

from collections import Counter

a = [1, 2, 3, 2]
b = [1, 2]

complement = list((Counter(a) - Counter(b)).elements())  # complement = [2, 3]

If the order of the items in the complement should stay the same as in the original list, then use something like this:

from collections import Counter, defaultdict
from itertools import count

a = [1,2,3,2]
b = [2,1]

c = Counter(b)
d = defaultdict(count)

complement = [x for x in a if next(d[x]) >= c[x]]  # complement = [3, 2]
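To make the second snippet easier to follow: defaultdict(count) hands every distinct value its own itertools.count() iterator, so next(d[x]) returns how many times x has been seen before (0, 1, 2, ...). An element is kept once that number reaches c[x], i.e. once b's share of that value has been skipped. A step-by-step sketch of the same logic (the trace list is only there for illustration):

```python
from collections import Counter, defaultdict
from itertools import count

a = [1, 2, 3, 2]
b = [2, 1]

c = Counter(b)          # how many copies of each value b takes away
d = defaultdict(count)  # one independent 0, 1, 2, ... counter per value

trace = []
complement = []
for x in a:
    seen_before = next(d[x])  # occurrences of x met earlier in a
    keep = seen_before >= c[x]
    trace.append((x, seen_before, c[x], keep))
    if keep:
        complement.append(x)

for row in trace:
    print(row)
# (1, 0, 1, False)
# (2, 0, 1, False)
# (3, 0, 0, True)
# (2, 1, 1, True)

print(complement)  # [3, 2]
```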

Answer 5

Main idea: if the values are not unique, make them unique

def add_duplicate_position(items):
    element_counter = {}
    for item in items:
        element_counter[item] = element_counter.setdefault(item,-1) + 1
        yield element_counter[item], item

assert list(add_duplicate_position([1, 2, 2, 3])) == [(0, 1), (0, 2), (1, 2), (0, 3)]

def create_complementary_list_with_duplicates(a,b):
    a = list(add_duplicate_position(a))
    b = set(add_duplicate_position(b))
    return [item for _,item in [x for x in a if x not in b]]

a = [1, 2, 2, 3]
b = [1, 2]
assert create_complementary_list_with_duplicates(a,b) == [2, 3]
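A small trace of the same idea on a slightly larger input (the lists below are made up for illustration). It prints the intermediate (position, value) pairs, which shows why the plain membership test becomes safe once every occurrence is tagged.

```python
def add_duplicate_position(items):
    # Pair each item with the number of equal items seen before it.
    element_counter = {}
    for item in items:
        element_counter[item] = element_counter.setdefault(item, -1) + 1
        yield element_counter[item], item


a = [2, 1, 2, 3, 2]
b = [2, 2]

tagged_a = list(add_duplicate_position(a))
tagged_b = set(add_duplicate_position(b))

print(tagged_a)          # [(0, 2), (0, 1), (1, 2), (0, 3), (2, 2)]
print(sorted(tagged_b))  # [(0, 2), (1, 2)]

# Each pair is unique, so "not in" only removes the occurrences b accounts for.
complement = [item for position, item in tagged_a
              if (position, item) not in tagged_b]

print(complement)  # [1, 3, 2]
```

The third 2 survives because its tag (2, 2) never appears among the pairs generated from b.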

Creating a complement of a list that preserves duplicate values
