Question

I want to merge two lists of dictionaries using multiple keys.

I have a list of dicts holding one set of results:

l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
      {'id': 2, 'year': '2017', 'resultA': 3},
      {'id': 1, 'year': '2018', 'resultA': 3},
      {'id': 2, 'year': '2018', 'resultA': 5}]

And a second list of dicts holding another set of results:

l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
      {'id': 2, 'year': '2017', 'resultB': 8},
      {'id': 1, 'year': '2018', 'resultB': 7},
      {'id': 2, 'year': '2018', 'resultB': 9}]

I want to combine them using 'id' and 'year' as the keys, to get the following:

all = [{'id': 1, 'year': '2017', 'resultA': 2, 'resultB': 5},
       {'id': 2, 'year': '2017', 'resultA': 3, 'resultB': 8},
       {'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 7},
       {'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 9}]

I know that to combine two lists of dicts on a single key, I can use this:

l1 = {d['id']:d for d in l1} 

all = [dict(d, **l1.get(d['id'], {})) for d in l2]

But that ignores the year and gives the following incorrect result:

all = [{'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 5},
       {'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 8},
       {'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 7},
       {'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 9}]

Treating this the way I would in R, by adding the second variable I want to merge on, I get a KeyError:

l1 = {d['id','year']:d for d in l1} 

all = [dict(d, **l1.get(d['id','year'], {})) for d in l2]

How can I merge using multiple keys?

Answer 1

Instead of d['id','year'], use the tuple (d['id'], d['year']) as the key.
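
The KeyError arises because d['id','year'] looks up the single key ('id', 'year') in d, which no dict in the lists contains. A minimal sketch of the suggested fix, reusing the data from the question (the names by_key and merged are illustrative and not from the original post):

```python
l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
      {'id': 2, 'year': '2017', 'resultA': 3},
      {'id': 1, 'year': '2018', 'resultA': 3},
      {'id': 2, 'year': '2018', 'resultA': 5}]

l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
      {'id': 2, 'year': '2017', 'resultB': 8},
      {'id': 1, 'year': '2018', 'resultB': 7},
      {'id': 2, 'year': '2018', 'resultB': 9}]

# Index l1 by the (id, year) tuple; tuples are hashable, so they work as dict keys.
by_key = {(d['id'], d['year']): d for d in l1}

# For each dict in l2, look up the matching l1 dict (if any) and merge the two.
merged = [dict(d, **by_key.get((d['id'], d['year']), {})) for d in l2]
```

Entries of l2 with no match in l1 are kept unchanged, because .get falls back to an empty dict.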

Answer 2

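You can concatenate the two lists and group the resulting list by id and year, then merge the dicts that share the same key.

The grouping can be done with itertools.groupby, and the merging with collections.ChainMap: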

>>> from itertools import groupby
>>> from collections import ChainMap

>>> [dict(ChainMap(*list(g))) for _,g in groupby(sorted(l1+l2, key=lambda x: (x['id'],x['year'])),key=lambda x: (x['id'],x['year']))]
[{'resultA': 2, 'id': 1, 'resultB': 5, 'year': '2017'}, {'resultA': 3, 'id': 1, 'resultB': 7, 'year': '2018'}, {'resultA': 3, 'id': 2, 'resultB': 8, 'year': '2017'}, {'resultA': 5, 'id': 2, 'resultB': 9, 'year': '2018'}]

Alternatively, to avoid the lambda, you can use operator.itemgetter:

>>> from operator import itemgetter
>>> [dict(ChainMap(*list(g))) for _,g in groupby(sorted(l1+l2, key=itemgetter('id', 'year')),key=itemgetter('id', 'year'))]
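
For readability, the one-liner can be split into steps. The following is only a sketch of the same groupby/ChainMap approach; merge_by_keys is a hypothetical helper name, not part of any library. Note that groupby only groups adjacent items, which is why the input is sorted by the same key first.

```python
from collections import ChainMap
from itertools import groupby
from operator import itemgetter

l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
      {'id': 2, 'year': '2017', 'resultA': 3},
      {'id': 1, 'year': '2018', 'resultA': 3},
      {'id': 2, 'year': '2018', 'resultA': 5}]

l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
      {'id': 2, 'year': '2017', 'resultB': 8},
      {'id': 1, 'year': '2018', 'resultB': 7},
      {'id': 2, 'year': '2018', 'resultB': 9}]


def merge_by_keys(dicts, *keys):
    """Merge dicts that share the same values for `keys` (hypothetical helper)."""
    key = itemgetter(*keys)
    # groupby only groups consecutive items, so sort by the same key first.
    ordered = sorted(dicts, key=key)
    # ChainMap layers the dicts of each group; dict() flattens them into one.
    return [dict(ChainMap(*group)) for _, group in groupby(ordered, key=key)]


merged = merge_by_keys(l1 + l2, 'id', 'year')
```

Because of the sort, the merged dicts come back ordered by (id, year) rather than in the order of the original lists.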

Answer 3

Expanding on @AlexHall's suggestion, you can use collections.defaultdict to help:

from collections import defaultdict

d = defaultdict(dict)

for i in l1 + l2:
    results = {k: v for k, v in i.items() if k not in ('id', 'year')}
    d[(i['id'], i['year'])].update(results)

Result

defaultdict(dict,
            {(1, '2017'): {'resultA': 2, 'resultB': 5},
             (1, '2018'): {'resultA': 3, 'resultB': 7},
             (2, '2017'): {'resultA': 3, 'resultB': 8},
             (2, '2018'): {'resultA': 5, 'resultB': 9}})
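
The result above is keyed by (id, year) tuples rather than being the flat list of dicts the question asks for. One possible way to convert it back, as a sketch (the name merged and the unpacking step are additions, not part of the original answer):

```python
from collections import defaultdict

l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
      {'id': 2, 'year': '2017', 'resultA': 3},
      {'id': 1, 'year': '2018', 'resultA': 3},
      {'id': 2, 'year': '2018', 'resultA': 5}]

l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
      {'id': 2, 'year': '2017', 'resultB': 8},
      {'id': 1, 'year': '2018', 'resultB': 7},
      {'id': 2, 'year': '2018', 'resultB': 9}]

# Same grouping step as in the answer: collect result fields per (id, year).
d = defaultdict(dict)
for i in l1 + l2:
    results = {k: v for k, v in i.items() if k not in ('id', 'year')}
    d[(i['id'], i['year'])].update(results)

# Rebuild flat dicts by unpacking each tuple key back into 'id' and 'year'.
merged = [{'id': id_, 'year': year, **results}
          for (id_, year), results in d.items()]
```

On Python 3.7 and later, dicts preserve insertion order, so merged follows the order in which each (id, year) pair first appears in l1 + l2.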

Merging lists of Python dictionaries using multiple keys
