I have three DataFrames that look like this:
   SA_ID MTRCYCCD DA_JAN_2015
0   1234        T     BUNDLED
1   5678        V     BUNDLED
2   2345        D     BUNDLED
3   7891        V     BUNDLED

   SA_ID MTRCYCCD DA_JAN_2016
0   1123        T     BUNDLED
1   5678        V     BUNDLED
2   4567        D     BUNDLED
3   7891        V     BUNDLED

   SA_ID MTRCYCCD DA_JAN_2017
0   1123        T     BUNDLED
1   5678        V     BUNDLED
2   2123        D     BUNDLED
3   1001        V          DA
I want to merge, group, or otherwise combine them into a single DataFrame like this:
   SA_ID MTRCYCCD DA_JAN_2015 DA_JAN_2016 DA_JAN_2017
0   1234        T     BUNDLED         NaN         NaN
1   5678        V     BUNDLED     BUNDLED     BUNDLED
2   2345        D     BUNDLED         NaN         NaN
3   7891        V     BUNDLED     BUNDLED         NaN
4   1123        T         NaN     BUNDLED     BUNDLED
5   4567        D         NaN     BUNDLED         NaN
6   2123        D         NaN         NaN     BUNDLED
7   1001        V         NaN         NaN          DA
I have tried a few things, such as:
pd.merge(df1, df2, on=['SA_ID', 'MTRCYCCD'], how='outer').merge(df3, on=['SA_ID', 'MTRCYCCD'], how='outer')
and:
pd.merge(df1, df2, left_on='SA_ID', right_on='SA_ID', how='outer').merge(df3, on='SA_ID', how='outer')
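To make this reproducible, here is a minimal self-contained sketch of the first attempt, with the three frames rebuilt from the sample data above. The names df1, df2, df3 follow the snippets; the keys list and the functools.reduce chaining are only my assumption about a tidy way to write the same chain of outer merges, not necessarily the right approach. I am not relying on the row order of the result, which may differ from the table I want.

```python
from functools import reduce

import pandas as pd

# Sample frames rebuilt from the tables in this post.
df1 = pd.DataFrame({
    "SA_ID": [1234, 5678, 2345, 7891],
    "MTRCYCCD": ["T", "V", "D", "V"],
    "DA_JAN_2015": ["BUNDLED", "BUNDLED", "BUNDLED", "BUNDLED"],
})
df2 = pd.DataFrame({
    "SA_ID": [1123, 5678, 4567, 7891],
    "MTRCYCCD": ["T", "V", "D", "V"],
    "DA_JAN_2016": ["BUNDLED", "BUNDLED", "BUNDLED", "BUNDLED"],
})
df3 = pd.DataFrame({
    "SA_ID": [1123, 5678, 2123, 1001],
    "MTRCYCCD": ["T", "V", "D", "V"],
    "DA_JAN_2017": ["BUNDLED", "BUNDLED", "BUNDLED", "DA"],
})

# Join on both identifying columns so they are not duplicated
# with _x/_y suffixes in the result.
keys = ["SA_ID", "MTRCYCCD"]

# Chain outer merges across all frames: df1 with df2, then that result with df3.
merged = reduce(
    lambda left, right: pd.merge(left, right, on=keys, how="outer"),
    [df1, df2, df3],
)

print(merged)
```

Merging on SA_ID alone, as in the second attempt, keeps a separate MTRCYCCD column from each frame (renamed with suffixes), which is why the sketch joins on both columns.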
I think groupby is what I need, but I'm not sure how to get from here to the result I want. Any ideas?