Question

I have two lists of dictionaries like this:

list1 = [
    {'doc': 1, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 1, 'pos_ini': 7, 'pos_fin': 12},
    {'doc': 2, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 7, 'pos_ini': 5, 'pos_fin': 10},  # only in list1
]

list2 = [
    {'doc': 1, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 1, 'pos_ini': 6, 'pos_fin': 7},    # only in list2
    {'doc': 1, 'pos_ini': 7, 'pos_fin': 12},
    {'doc': 2, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 2, 'pos_ini': 25, 'pos_fin': 30},  # only in list2
]

list2 has two elements that list1 does not have, and list1 has one element that list2 does not have.

I need a list_result that merges all of the elements:

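list_result = [
    {'doc': 1, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 1, 'pos_ini': 6, 'pos_fin': 7},    # from list2 only
    {'doc': 1, 'pos_ini': 7, 'pos_fin': 12},
    {'doc': 2, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 2, 'pos_ini': 25, 'pos_fin': 30},  # from list2 only
    {'doc': 7, 'pos_ini': 5, 'pos_fin': 10},   # from list1 only
]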

What is the best way to do this in Python? Thanks!

Answer 1

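Python has a built-in set type that suits this operation well. The catch is that sets require hashable elements, so you have to convert each dictionary into a tuple of its items first: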

[dict(items) for items in set(tuple(sorted(d.items())) for d in (list1 + list2))]
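As a rough, self-contained sketch, the one-liner above can be run against the data from the question. The helper name merge_unique is hypothetical and only wraps the same idea so that it can be reused; the dictionary keys are assumed to be the strings 'doc', 'pos_ini' and 'pos_fin'.

```python
list1 = [
    {'doc': 1, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 1, 'pos_ini': 7, 'pos_fin': 12},
    {'doc': 2, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 7, 'pos_ini': 5, 'pos_fin': 10},
]
list2 = [
    {'doc': 1, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 1, 'pos_ini': 6, 'pos_fin': 7},
    {'doc': 1, 'pos_ini': 7, 'pos_fin': 12},
    {'doc': 2, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 2, 'pos_ini': 25, 'pos_fin': 30},
]


def merge_unique(*lists):
    """Hypothetical helper: union of several dict lists, duplicates removed."""
    # sorted() makes each tuple independent of the dict's key order.
    seen = {tuple(sorted(d.items())) for lst in lists for d in lst}
    # Turn every unique tuple of (key, value) pairs back into a dict.
    return [dict(items) for items in seen]


merged = merge_unique(list1, list2)
print(len(merged))  # 6 unique dictionaries, in no guaranteed order
```

Note that this only works when every value in the dictionaries is itself hashable, which holds for the integers used here.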

Answer 2

You can build a set from these values by first converting each dictionary into a hashable object such as a tuple:

unique_list = set(tuple(dictionary.items()) for dictionary in list1 + list2)

Then you can convert the tuples back into dictionaries and collect them in a list:

l = []
for item in unique_list:
    l.append(dict(item))

The above should work.
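One caveat with plain tuples of items() is worth checking, assuming Python 3.7 or later, where dictionaries keep insertion order: two dictionaries with the same content but a different key order produce different tuples. The sketch below uses two made-up dictionaries, a and b, to show the difference that sorting the items makes.

```python
a = {'doc': 1, 'pos_ini': 5, 'pos_fin': 10}
b = {'pos_fin': 10, 'doc': 1, 'pos_ini': 5}  # same content, different key order

# Dictionary equality ignores key order.
assert a == b

# Plain tuples follow insertion order, so a and b look like different elements.
plain = {tuple(d.items()) for d in [a, b]}

# Sorting the items first gives both dictionaries the same tuple.
normalized = {tuple(sorted(d.items())) for d in [a, b]}

print(len(plain), len(normalized))  # 2 1
```

If the dictionaries in both lists are always built with the same key order, as in the question, the unsorted version gives the same result.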

Answer 3

You can use frozenset() to turn each dictionary's items() into a hashable key of a new dictionary, then simply take the values:

list({frozenset(x.items()): x for x in list1 + list2}.values())

Or use map() applied to a set comprehension:

list(map(dict, {frozenset(x.items()) for x in list1 + list2}))

Or even just a list comprehension:

[dict(d) for d in {frozenset(x.items()) for x in list1 + list2}]

These give a result in no guaranteed order, for example:

[{'doc': 1, 'pos_fin': 10, 'pos_ini': 5},
 {'doc': 1, 'pos_fin': 12, 'pos_ini': 7},
 {'doc': 2, 'pos_fin': 10, 'pos_ini': 5},
 {'doc': 7, 'pos_fin': 10, 'pos_ini': 5},
 {'doc': 1, 'pos_fin': 7, 'pos_ini': 6},
 {'doc': 2, 'pos_fin': 30, 'pos_ini': 25}]

Note: if order matters, you can use collections.OrderedDict() here instead:

from collections import OrderedDict

list(OrderedDict((frozenset(x.items()), x) for x in list1 + list2).values())

which gives the following result, in order of first appearance:

[{'doc': 1, 'pos_fin': 10, 'pos_ini': 5},
 {'doc': 1, 'pos_fin': 12, 'pos_ini': 7},
 {'doc': 2, 'pos_fin': 10, 'pos_ini': 5},
 {'doc': 7, 'pos_fin': 10, 'pos_ini': 5},
 {'doc': 1, 'pos_fin': 7, 'pos_ini': 6},
 {'doc': 2, 'pos_fin': 30, 'pos_ini': 25}]
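The order of first appearance is not the same as the order of list_result in the question, which looks sorted by doc and then by position. A minimal sketch of one way to get that order, assuming Python 3.7 or later (where a plain dict also keeps insertion order) and assuming that sorting on doc, pos_ini and pos_fin is what the asker wants:

```python
list1 = [
    {'doc': 1, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 1, 'pos_ini': 7, 'pos_fin': 12},
    {'doc': 2, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 7, 'pos_ini': 5, 'pos_fin': 10},
]
list2 = [
    {'doc': 1, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 1, 'pos_ini': 6, 'pos_fin': 7},
    {'doc': 1, 'pos_ini': 7, 'pos_fin': 12},
    {'doc': 2, 'pos_ini': 5, 'pos_fin': 10},
    {'doc': 2, 'pos_ini': 25, 'pos_fin': 30},
]

# A plain dict keeps insertion order on Python 3.7+, so duplicates are
# dropped while the first-seen order of the elements survives.
merged = list({frozenset(d.items()): d for d in list1 + list2}.values())

# Assumed sort key: doc first, then pos_ini, then pos_fin.
list_result = sorted(merged, key=lambda d: (d['doc'], d['pos_ini'], d['pos_fin']))

for d in list_result:
    print(d)
```

Because frozenset() ignores the order of the items, this variant also treats dictionaries with the same content but a different key order as duplicates.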

Merging two lists of dictionaries without an id in Python
