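Pandas: merging two DataFrames on a common column that contains dictionaries

Time: 2019-11-14 14:02:39

Tags: python pandas dictionary

How can I compare and merge two DataFrames based on a common column that holds different dictionaries?

I have the following two DataFrames: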

import pandas as pd

df1 = pd.DataFrame({'name':['tom','keith','sam','joe'],'assets':[{'laptop':1,'scanner':2},{'laptop':1,'printer':3}, {'car':12,'keys':34},{'power-cables':24}]})

df2 = pd.DataFrame({'place':['ca','bal-vm'],'default_assets':[{'laptop':4,'printer':3,'scanner':2,'bag':8},{'car':12,'keys':34,'mat':24,'holder':45}]})


df1:
   name   assets
0  tom    {'laptop':1,'scanner':2}
1  keith  {'laptop':1,'printer':3}
2  sam    {'car':12,'keys':34}
3  joe    {'power-cables':24}

df2:
   place  default_assets
0  ca     {'laptop':4,'printer':3,'scanner':2,'bag':8}
1  bal-vm {'car':12,'keys':34,'mat':24,'holder':45}

df1.assets should be merged with df2.default_assets when all keys of df1.assets are present in df2.default_assets; otherwise None should be filled in.
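To make the condition concrete, here is a minimal sketch of the key check in plain Python, using rows copied from the two frames above. The variable names (tom_assets, joe_assets, ca_defaults) are only illustrative and do not appear in the original data.

```python
# Rows copied from df1.assets and df2.default_assets above.
tom_assets = {'laptop': 1, 'scanner': 2}
joe_assets = {'power-cables': 24}
ca_defaults = {'laptop': 4, 'printer': 3, 'scanner': 2, 'bag': 8}

# dict.keys() returns a set-like view, so <= tests "is a subset of".
print(tom_assets.keys() <= ca_defaults.keys())  # True
print(joe_assets.keys() <= ca_defaults.keys())  # False
```

Only the keys are compared here; the values (for example laptop 1 versus laptop 4) play no part in the match.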

So the resulting df should be:

    df:
       name    place    assets                    default_assets
    0  tom     ca       {'laptop':1,'scanner':2}  {'laptop':4,'printer':3,'scanner':2,'bag':8}
    1  keith   ca       {'laptop':1,'printer':3}  {'laptop':4,'printer':3,'scanner':2,'bag':8}
    2  sam     bal-vm   {'car':12,'keys':34}      {'car':12,'keys':34,'mat':24,'holder':45} 
    3  joe     None     {'power-cables':24}       None

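1 Answer:

Answer 0 (score: 2)

You can do the following:

  1. Cross-join (Cartesian product) every row of df1 with every row of df2.
  2. Then keep only the rows where all keys of df1.assets are present in df2.default_assets.
  3. Use pandas.concat to add back the df1 rows that were filtered out.

For example: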

# cross join
merged = df1.assign(key=1).merge(df2.assign(key=1), on='key').drop('key', axis=1)

# mask to filter: True where every key of assets is also a key of default_assets
mask = [asset.keys() <= default.keys() for asset, default in zip(merged['assets'], merged['default_assets'])]

# add those not in the mask
result = pd.concat([merged.loc[mask], df1], sort=True).drop_duplicates('name')

# print in full
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(result)

Output

                        assets  \
0  {'laptop': 1, 'scanner': 2}   
2  {'laptop': 1, 'printer': 3}   
5      {'car': 12, 'keys': 34}   
3         {'power-cables': 24}   

                                      default_assets   name   place  
0  {'laptop': 4, 'printer': 3, 'scanner': 2, 'bag...    tom      ca  
2  {'laptop': 4, 'printer': 3, 'scanner': 2, 'bag...  keith      ca  
5   {'car': 12, 'keys': 34, 'mat': 24, 'holder': 45}    sam  bal-vm  
3                                                NaN    joe     NaN
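
As a possible variation, the same idea can be sketched without the temporary key column, assuming pandas 1.2 or later, where merge accepts how='cross'. Instead of concat plus drop_duplicates, this sketch left-joins the matched pairs back onto df1; the names crossed and matched are illustrative and not part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({
    'name': ['tom', 'keith', 'sam', 'joe'],
    'assets': [{'laptop': 1, 'scanner': 2}, {'laptop': 1, 'printer': 3},
               {'car': 12, 'keys': 34}, {'power-cables': 24}],
})
df2 = pd.DataFrame({
    'place': ['ca', 'bal-vm'],
    'default_assets': [{'laptop': 4, 'printer': 3, 'scanner': 2, 'bag': 8},
                       {'car': 12, 'keys': 34, 'mat': 24, 'holder': 45}],
})

# Cross join without a helper key column (assumes pandas >= 1.2).
crossed = df1.merge(df2, how='cross')

# Keep pairs where every key of assets also appears in default_assets.
mask = [a.keys() <= d.keys()
        for a, d in zip(crossed['assets'], crossed['default_assets'])]
matched = crossed.loc[mask, ['name', 'place', 'default_assets']]

# Left-join back onto df1 so names without a match (here: joe) are kept.
result = df1.merge(matched, on='name', how='left')
result = result[['name', 'place', 'assets', 'default_assets']]
print(result)
```

Two differences from the concat version are worth noting: the row index is renumbered from 0, and a name whose assets fit more than one place would appear once per matching place rather than being reduced to its first match. Unmatched entries are shown as NaN rather than None, as in the output above.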