Question

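The best way I can describe what I want to achieve is by reference to how SQL's INNER JOIN works: it shows data from two tables, selected by matching column names.

I want to do something similar in Python (preferably 3.x), except that instead of tables with matching column names I want to join whole dictionaries on a matching {k: v} pair.

For example...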

lst_1 = [
    {
        'City'      :   'Boston',
        'State'     :   'Massechusets',
        'Name'      :   'Kim Tuttles',
        'Country'   :   'United State'
    },
    {
        'City'      :   'Portland',
        'Name'      :   'Larry Bird',
        'State'     :   'Oregon'
    },
    {
        'City'      :   'Chicago',
        'Name'      :   'John Jacobs',
        'State'     :   'Illinois'
    }
]

lst_2 = [
    {
        'Hobby'     :   'Tennis',
        'Build'     :   'Athletic',
        'Height'    :   'Six Feet, One Inch',
        'Name'      :   'Kim Tuttles',
        'Birthplace':   'Italy'
    },
    {
        'Name'      :   'John Jacobs',
        'Hobby'     :   'Baseball',
        'Build'     :   'Muscular',
        'Height'    :   'Five Feet, Eight Inches'
    }
]

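I would like to find a way to merge the dictionaries from each list, but only where a matching {Key: Value} pair is found. The result would look like this...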

merged_lst = [
    {
        'Hobby'     :   'Tennis',
        'Build'     :   'Athletic',
        'Height'    :   'Six Feet, One Inch',
        'Birthplace':   'Italy',
        'City'      :   'Boston',
        'State'     :   'Massechusets',
        'Name'      :   'Kim Tuttles', # Merge on matching name
        'Country'   :   'United State'
    },
    {
        'Name'      :   'John Jacobs', # Merge on matching name
        'Hobby'     :   'Baseball',
        'Build'     :   'Muscular',
        'Height'    :   'Five Feet, Eight Inches',
        'City'      :   'Chicago',
        'State'     :   'Illinois'
    }
]

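I managed to find a way to merge dictionaries using dict.update and zip(), but that only handles two standalone dictionaries and still isn't quite right. Any suggestions are appreciated, and thanks in advance.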

Answer 1

Ignoring the problems that conflicts on other keys would introduce, in Python 3.5+ we can do the following.

k = 'Name'
merged_lst = [{**a, **b} for a in lst_1 for b in lst_2 if a[k]==b[k]]

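{**a, **b} is a nice way to unpack two dictionaries into one combined dictionary (on a key conflict it uses the value from b rather than a). This is the only step that requires 3.5+. In Python 2.x with string keys, a similar construct is dict(a, **b), although Guido was not happy with it. Other options are more verbose.
With a Python list comprehension, you can easily iterate over the Cartesian product of lst_1 and lst_2 by using for twice.
We only care about dictionaries whose 'Name' is the same, hence the a[k]==b[k] bit.
If you are allowed to destroy any of the dictionaries in lst_1 or lst_2, an approach involving dict.update() might be faster. It might be fine anyway, though I don't think the syntax is as nice.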
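
To make the conflict rule and the Cartesian-product filtering concrete, here is a minimal sketch. The helper name merge_on_key and the two small sample lists are assumptions made for illustration, not part of the original answer, and the code assumes every dictionary contains the join key.

```python
def merge_on_key(left, right, key):
    """Merge every pair of dicts from left and right that agree on `key`.

    Hypothetical helper for illustration; assumes each dict contains `key`.
    """
    # Two `for` clauses walk the Cartesian product; the `if` keeps matching pairs.
    return [{**a, **b} for a in left for b in right if a[key] == b[key]]


people = [
    {'Name': 'Kim Tuttles', 'City': 'Boston'},
    {'Name': 'Larry Bird', 'City': 'Portland'},
]
hobbies = [
    # 'City' is deliberately present on both sides to show the conflict rule.
    {'Name': 'Kim Tuttles', 'Hobby': 'Tennis', 'City': 'Rome'},
]

joined = merge_on_key(people, hobbies, 'Name')
print(joined)
```

On the conflicting 'City' key the right-hand dictionary wins, and because {**a, **b} builds a new dictionary, neither input list is modified.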

Answer 2

You can do the following:

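# Mutates the dicts in lst_2 in place; values from lst_1 win on key conflicts.
for l2 in lst_2:
    # The {} default keeps next() from raising StopIteration when no name matches.
    l2.update(next((l1 for l1 in lst_1 if l1["Name"] == l2["Name"]), {}))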
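
The loop above rescans lst_1 for every dictionary in lst_2 and changes lst_2 in place. As a hedged alternative sketch (not the answerer's code), one could index one list by name first and build new dictionaries instead. It assumes names are unique within the indexed list; the names left, right, and by_name are illustrative.

```python
left = [
    {'Name': 'Kim Tuttles', 'City': 'Boston'},
    {'Name': 'Larry Bird', 'City': 'Portland'},
    {'Name': 'John Jacobs', 'City': 'Chicago'},
]
right = [
    {'Name': 'Kim Tuttles', 'Hobby': 'Tennis'},
    {'Name': 'John Jacobs', 'Hobby': 'Baseball'},
    {'Name': 'Nobody Here', 'Hobby': 'Chess'},
]

# Index the left list by name (assumes each name appears at most once in it).
by_name = {d['Name']: d for d in left}

# Keep only right-hand dicts whose name is in the index, and build new dicts so
# neither input list is modified. Left-hand values win on a conflict, which
# matches the direction of l2.update(l1) in the loop above.
merged = [{**r, **by_name[r['Name']]} for r in right if r['Name'] in by_name]
print(merged)
```

The unmatched 'Nobody Here' entry is dropped from the result, which is the inner-join behaviour the question asks for.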

Answer 3

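You can write a function that filters one list down to the names that occur in both lists, copies those dictionaries into a result list, and updates them there:

Function: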

import copy

def mergeSameNameDicts(l1, l2):
    # names that occur in both lists
    duplicateNames = set(p["Name"] for p in l1) & set(p["Name"] for p in l2)

    rv = []        # collects enriched dicts
    for d in l1:
        if d["Name"] in duplicateNames:
            rv.append(copy.copy(d))           # shallow-copy dict over from l1

    for d in l2:                              # enhance with data from l2
        if d["Name"] in duplicateNames:       # if the name is a dupe, enhance all
            for d1 in rv:                     # dicts with that name inside rv
                if d["Name"] == d1["Name"]:   # values from l2 overwrite those from l1
                    d1.update(d)              # for keys present in both dicts
    return rv

print(mergeSameNameDicts(lst_1,lst_2))

Output:

[{'City': 'Boston',
  'State': 'Massechusets',
  'Name': 'Kim Tuttles',
  'Country': 'United State',
  'Hobby': 'Tennis',
  'Build': 'Athletic',
  'Height': 'Six Feet, One Inch',
  'Birthplace': 'Italy'},

 {'City': 'Chicago',
  'Name': 'John Jacobs',
  'State': 'Illinois',
  'Hobby': 'Baseball',
  'Build': 'Muscular',
  'Height': 'Five Feet, Eight Inches'}]

Answer 4

This is similar to a left join in an RDBMS (e.g. MySQL) and to MongoDB's $lookup (aggregation) stage. You can look those up for further clarification.
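
For comparison, the difference between a left join and the inner join the question asks about can be sketched in plain Python. This is an illustration under assumptions: the helper names left_join and inner_join and the sample data are hypothetical, and names are assumed to be unique in the right-hand list.

```python
def left_join(left, right, key):
    """Keep every left dict; add right-hand fields where the key matches."""
    index = {d[key]: d for d in right}  # assumes unique keys on the right
    return [{**d, **index.get(d[key], {})} for d in left]


def inner_join(left, right, key):
    """Keep only left dicts that have a match on the right."""
    index = {d[key]: d for d in right}  # assumes unique keys on the right
    return [{**d, **index[d[key]]} for d in left if d[key] in index]


cities = [{'Name': 'Kim Tuttles', 'City': 'Boston'},
          {'Name': 'Larry Bird', 'City': 'Portland'}]
hobbies = [{'Name': 'Kim Tuttles', 'Hobby': 'Tennis'}]

print(left_join(cities, hobbies, 'Name'))   # Larry Bird is kept, without a Hobby
print(inner_join(cities, hobbies, 'Name'))  # Larry Bird is dropped
```

The expected result in the question drops Larry Bird, so it corresponds to the inner-join variant here.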

How can I combine one nested dictionary with another nested dictionary, but only if each nested dictionary has a matching value?
