How to merge multiple DataFrames into one MultiIndex DataFrame, where each merged DataFrame becomes a second-level column

Time: 2017-09-08 06:37:45

Tags: python pandas dataframe merge multi-index

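What I want to achieve (without too much reshuffling) is to merge 3 different DataFrames that all share the same columns and index, but each represent a different category.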

DF1

                                    Children    Movie enthusiast
household       
06f32e6e45da385834dac983256d59f3    0.086158    NaN
0d1974107c6731989c762e96def73568    0.120285    0.187764
0fd4f3b4adf43682f08e693a905b7432    0.400000    0.114686
11e0057cdc8b8e1b1cdabfa8a092ea5f    NaN         0.140000
120549af6977623bd01d77135a91a523    0.335238    0.192578

DF2

                                    Children    Movie enthusiast
household       
06f32e6e45da385834dac983256d59f3    1.0         0.0
0d1974107c6731989c762e96def73568    4.0         11.0
0fd4f3b4adf43682f08e693a905b7432    1.0         5.0
11e0057cdc8b8e1b1cdabfa8a092ea5f    0.0         2.0
120549af6977623bd01d77135a91a523    7.0         9.0

DF3

                                    Children    Movie enthusiast
household       
06f32e6e45da385834dac983256d59f3    nan         nan
0d1974107c6731989c762e96def73568    0.138       0.037
0fd4f3b4adf43682f08e693a905b7432    nan         0.025
11e0057cdc8b8e1b1cdabfa8a092ea5f    nan         0.153
120549af6977623bd01d77135a91a523    0.091       0.021

df_merged (filled in by hand, so not all values are present, but you get the idea)

                                    Children                Movie enthusiast
                                    df1      df2    df3     df1      df2    df3
household                       
06f32e6e45da385834dac983256d59f3    0.086158 1      NaN     NaN      NaN    NaN
0d1974107c6731989c762e96def73568    0.120285 4      0.138   0.187764 NaN    NaN
0fd4f3b4adf43682f08e693a905b7432    0.400000 1      NaN     0.114686 NaN    NaN
11e0057cdc8b8e1b1cdabfa8a092ea5f    NaN      0      NaN     0.140000 NaN    NaN
120549af6977623bd01d77135a91a523    0.335238 7      0.091   0.192578 NaN    NaN

1 answer:

Answer 0: (score: 0)

I think you need concat with the keys parameter, followed by swaplevel and sort_index, to get the MultiIndex columns in the desired format:

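import pandas as pd

# keys adds an outer column level naming the source frame
df = (pd.concat([df1, df2, df3], keys=['df1', 'df2', 'df3'], axis=1)
        .swaplevel(0, 1, axis=1)   # move the original column names to the outer level
        .sort_index(axis=1))       # group by column name, then by source frame
print(df)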

                                  Children             Movie enthusiast  \
                                       df1  df2    df3              df1   
household                                                                
06f32e6e45da385834dac983256d59f3  0.086158  1.0    NaN              NaN   
0d1974107c6731989c762e96def73568  0.120285  4.0  0.138         0.187764   
0fd4f3b4adf43682f08e693a905b7432  0.400000  1.0    NaN         0.114686   
11e0057cdc8b8e1b1cdabfa8a092ea5f       NaN  0.0    NaN         0.140000   
120549af6977623bd01d77135a91a523  0.335238  7.0  0.091         0.192578   

                                 Movie enthusiast         
                                              df2    df3  
household                                                 
06f32e6e45da385834dac983256d59f3              0.0    NaN  
0d1974107c6731989c762e96def73568             11.0  0.037  
0fd4f3b4adf43682f08e693a905b7432              5.0  0.025  
11e0057cdc8b8e1b1cdabfa8a092ea5f              2.0  0.153  
120549af6977623bd01d77135a91a523              9.0  0.021  
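
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. It rebuilds the first three rows of the question's frames by hand (the variable names idx, cols, merged, children and df2_only are only illustrative) and applies the same concat / swaplevel / sort_index chain. It also shows two ways one might read data back out of the result: plain column selection on the outer level, and xs on the inner level.

```python
import numpy as np
import pandas as pd

# Sample data copied from the first three rows of the question's frames
idx = pd.Index(
    [
        "06f32e6e45da385834dac983256d59f3",
        "0d1974107c6731989c762e96def73568",
        "0fd4f3b4adf43682f08e693a905b7432",
    ],
    name="household",
)
cols = ["Children", "Movie enthusiast"]

df1 = pd.DataFrame(
    [[0.086158, np.nan], [0.120285, 0.187764], [0.400000, 0.114686]],
    index=idx, columns=cols,
)
df2 = pd.DataFrame(
    [[1.0, 0.0], [4.0, 11.0], [1.0, 5.0]],
    index=idx, columns=cols,
)
df3 = pd.DataFrame(
    [[np.nan, np.nan], [0.138, 0.037], [np.nan, 0.025]],
    index=idx, columns=cols,
)

merged = (
    pd.concat([df1, df2, df3], keys=["df1", "df2", "df3"], axis=1)
    .swaplevel(0, 1, axis=1)  # original column names become the outer level
    .sort_index(axis=1)       # sort by column name, then by source frame
)
print(merged)

# All three source frames for one original column
children = merged["Children"]

# One source frame across all original columns
df2_only = merged.xs("df2", axis=1, level=1)
```

Since the frames share the same index, concat aligns them row by row; if the indexes differed, the default outer join would introduce NaN for households missing from a frame.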