Combining multiple dictionaries of pandas DataFrames by dictionary key and common columns

Date: 2020-10-12 20:15:08

Tags: python python-3.x pandas dictionary

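I have four dictionaries, each created by splitting one of four DataFrames by group. I now need to join the DataFrames from each dictionary into a new dictionary, using the dictionary key and the common columns as the join condition.

For example: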

import pandas as pd
from functools import reduce


df_1 = pd.DataFrame({'Group': ['A','B','C'] , 'ID': [1,2,3],'count': [10, 20, 30], 'colors': ['red', 'white', 'blue']})

df_2 = pd.DataFrame({'Group': ['A','B','C'] , 'ID': [1,2,3],'time': [1.3, 2.5, 3]})

df_3 = pd.DataFrame({'Group': ['A','B','C'] , 'ID': [1,2,3],'order_num': [2, 4, 7]})

df_4 = pd.DataFrame({'Group': ['A','B','C'] , 'ID': [1,2,3],'result': ['g','b','b']})

dict1= dict(tuple(df_1.groupby('Group')))
dict2= dict(tuple(df_2.groupby('Group')))
dict3= dict(tuple(df_3.groupby('Group')))
dict4= dict(tuple(df_4.groupby('Group')))

The desired result, produced with a manual solution:

datA=[dict1['A'],dict2['A'],dict3['A'],dict4['A']]
datB=[dict1['B'],dict2['B'],dict3['B'],dict4['B']]
datC=[dict1['C'],dict2['C'],dict3['C'],dict4['C']]

final_dict = {'A' : reduce(lambda left,right: pd.merge(left,right,on=['Group','ID']), datA),
            'B' : reduce(lambda left,right: pd.merge(left,right,on=['Group','ID']), datB),
            'C' : reduce(lambda left,right: pd.merge(left,right,on=['Group','ID']), datC)}

Any help finding a scalable, non-manual solution would be greatly appreciated.

1 Answer:

Answer 0 (score: 0)

Is this dynamic enough?

# Put all your dicts into a dict of dicts
dict_dict = {str(i):dict_i for i,dict_i in enumerate([dict1,dict2,dict3,dict4])}

# Swap the nesting order so that group keys are the dict keys and
# each value is the list of that group's DataFrames (one per source dict)
dat_dicts = {group_key:[df_dict[group_key] for df_dict in dict_dict.values()]
             for group_key in list(dict_dict.values())[0].keys()}

# Apply the reduce on each group key to merge the dfs
merged_dat_df_dict = {group_key:reduce(lambda left,right:
                                 pd.merge(left,right,on=['Group','ID']),
                                 dat_df_list)
                      for group_key,dat_df_list in dat_dicts.items()}
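
The same idea can be wrapped in a small helper that skips the intermediate dict of dicts. This is only a sketch: the function name `merge_grouped_dicts` and the sample frames `df_a` and `df_b` are hypothetical and not part of the question. It also assumes that only group keys present in every dictionary should be merged, which the question does not state.

```python
from functools import reduce

import pandas as pd


def merge_grouped_dicts(dicts, on):
    """Merge the DataFrames stored under each shared key across several dicts.

    Hypothetical helper: `dicts` is a list of {group_key: DataFrame} mappings,
    `on` is the list of column names to join on.
    """
    # Assumption: keep only keys present in every dict, in the first dict's order.
    shared_keys = [k for k in dicts[0] if all(k in d for d in dicts[1:])]
    return {
        key: reduce(lambda left, right: pd.merge(left, right, on=on),
                    (d[key] for d in dicts))
        for key in shared_keys
    }


# Small made-up frames that mirror the structure used in the question.
df_a = pd.DataFrame({'Group': ['A', 'B'], 'ID': [1, 2], 'count': [10, 20]})
df_b = pd.DataFrame({'Group': ['A', 'B'], 'ID': [1, 2], 'time': [1.3, 2.5]})
dicts = [dict(tuple(df.groupby('Group'))) for df in (df_a, df_b)]

merged = merge_grouped_dicts(dicts, on=['Group', 'ID'])
```

With the question's data, the call would be `merge_grouped_dicts([dict1, dict2, dict3, dict4], on=['Group', 'ID'])`. Because `pd.merge` defaults to an inner join, rows whose `Group`/`ID` pair is missing from any one frame are dropped from that group's result.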