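I have two lists, each containing non-unique numbers, meaning the same value can appear more than once.
I need to find the difference between them while accounting for the fact that a value may occur multiple times (so I can't simply take the difference of their sets). In other words, I need to check whether a value occurs more often in the first list than in the second.
The lists are: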
l1 = [1, 2, 5, 3, 3, 4, 9, 8, 2]
l2 = [1, 1, 3, 2, 4, 8, 9]
# Sorted and justified
l1 = [1, 2, 2, 3, 3, 4, 5, 8, 9]
l2 = [1, 1, 2, 3, 4, 8, 9]
The list elements can be strings, ints, or floats. So the resulting lists should be:
difference(l1, l2) == [3, 5, 2]
# There is an extra 2 and 3 in l1 that is not in l2, and a 5 in l1 but not l2.
difference(l2, l1) == [1]
# The extra 1 is the only value in l2 but not in l1.
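I tried the list comprehension [x for x in l1 if x not in l2]
This doesn't work because it doesn't account for values that appear in both lists but a different number of times.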
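For reference, running that comprehension on the lists above makes the problem concrete. This is only a small check of the attempt described in the question; the variable names naive and naive_reverse are made up for illustration.

```python
l1 = [1, 2, 5, 3, 3, 4, 9, 8, 2]
l2 = [1, 1, 3, 2, 4, 8, 9]

# Membership testing ignores how many times a value occurs,
# so the surplus 2 and 3 in l1 are silently dropped.
naive = [x for x in l1 if x not in l2]
print(naive)  # [5]

# Likewise, the extra 1 in l2 is missed entirely.
naive_reverse = [x for x in l2 if x not in l1]
print(naive_reverse)  # []
```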
Answer 0 (score: 4)
If order doesn't matter, you can use Counter (see the collections module in the standard library):
from collections import Counter
l1 = [1,2,5,3,3,4,9,8,2]
l2 = [1,1,3,2,4,8,9]
c1 = Counter(l1) # Counter({2: 2, 3: 2, 1: 1, 5: 1, 4: 1, 9: 1, 8: 1})
c2 = Counter(l2) # Counter({1: 2, 3: 1, 2: 1, 4: 1, 8: 1, 9: 1})
diff1 = list((c1-c2).keys()) # [2, 5, 3]
diff2 = list((c2-c1).keys()) # [1]
This is quite general and works for strings too:
...
l1 = ['foo', 'foo', 'bar']
l2 = ['foo', 'bar', 'bar', 'baz']
...
# diff1 == ['foo']
# diff2 == ['bar', 'baz']
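Note that .keys() reports each surplus value once, no matter how many extra copies there are. If the multiplicity matters, Counter.elements() expands the result back into repeated items. A minimal sketch, assuming hashable elements; the helper name counter_difference is hypothetical and not part of the answer above:

```python
from collections import Counter

def counter_difference(a, b):
    """Return the items a has in surplus over b, repeated by their surplus count."""
    # Counter subtraction keeps only the positive counts.
    return list((Counter(a) - Counter(b)).elements())

# Two extra 1s in the first list, so 1 appears twice in the result.
print(counter_difference([1, 1, 1, 2], [1, 2]))  # [1, 1]

# The string example from above gives the same result as before.
print(counter_difference(['foo', 'foo', 'bar'], ['foo', 'bar', 'bar', 'baz']))  # ['foo']
```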
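Answer 1 (score: 2)
I feel that a lot of people will come here looking for a multiset difference (e.g. [1, 1, 1, 2, 2, 2, 3, 3] - [1, 2, 2] == [1, 1, 2, 3, 3]), so I'll post an answer for that here as well: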
import collections

def multiset_difference(a, b):
    """Compute a - b of two multisets a and b"""
    a = collections.Counter(a)
    b = collections.Counter(b)

    difference = a - b
    return difference  # Remove this line if you want it as a list

    as_list = []
    for item, count in difference.items():
        as_list.extend([item] * count)
    return as_list

def ordered_multiset_difference(a, b):
    """As above, but preserves order and is O(ab) worst case"""
    difference = list(a)
    depleted = set()  # Values that aren't in difference to prevent searching the list again
    for i in b:
        if i not in depleted:
            try:
                difference.remove(i)
            except ValueError:
                depleted.add(i)
    return difference
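The O(ab) bound of the ordered version comes from list.remove scanning the list on every call. If the items are hashable, one possible alternative is to count b once and then walk a a single time. This is a sketch under that assumption rather than part of the answer above, and the function name is hypothetical:

```python
from collections import Counter

def ordered_multiset_difference_linear(a, b):
    """Order-preserving a - b; each item of b cancels the earliest matching item of a."""
    remaining = Counter(b)  # how many copies of each value are still to be cancelled
    result = []
    for item in a:
        if remaining[item] > 0:
            # This occurrence is cancelled by one occurrence in b.
            remaining[item] -= 1
        else:
            result.append(item)
    return result

print(ordered_multiset_difference_linear([1, 1, 1, 2, 2, 2, 3, 3], [1, 2, 2]))
# [1, 1, 2, 3, 3]
```

Looking up a missing key in a Counter returns 0 without inserting it, so values that never occur in b pass straight through to the result.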
Answer 2 (score: 0)
Using Counter is probably the better option, but to roll your own:
def diff(a, b):
    result = []
    cpy = b[:]
    for ele in a:
        if ele in cpy:
            cpy.remove(ele)
        else:
            result.append(ele)
    return result
Or as an abusive one-liner:
def diff(a, b):
    return [ele for ele in a if ele not in b or b.remove(ele)]
The one-liner destroys b while building the difference, so you may want to pass a copy: diff(l1, l2[:]), or use:
def diff(a, b):
    cpy = b[:]
    return [ele for ele in a if ele not in cpy or cpy.remove(ele)]
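To illustrate the side effect, here is a short run of the destructive one-liner on the lists from the question. The function is renamed diff_in_place here purely for illustration:

```python
def diff_in_place(a, b):
    # list.remove returns None (falsy), so matched elements are
    # filtered out of the result while b is consumed as a side effect.
    return [ele for ele in a if ele not in b or b.remove(ele)]

l1 = [1, 2, 5, 3, 3, 4, 9, 8, 2]
l2 = [1, 1, 3, 2, 4, 8, 9]

print(diff_in_place(l1, l2[:]))  # [5, 3, 2]
print(l2)                        # [1, 1, 3, 2, 4, 8, 9], untouched because a copy was passed

print(diff_in_place(l1, l2))     # [5, 3, 2]
print(l2)                        # [1], only the unmatched element is left
```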