Python Pandas: concatenate two MultiIndex DataFrames into one DataFrame with an additional MultiIndex level

Time: 2018-01-16 05:21:56

Tags: python pandas dataframe concatenation multi-index

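I'm working with MultiIndexes again. This time I want to merge two DataFrames whose indexes and columns have the same MultiIndex structure. The values differ, however, and some of the index.level(0) values differ as well. I want to combine the two DataFrames into one. Please see the example below.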

In [13]: df_ver1
Out[13]: 
key  nm         0         1         2         3
bar one -0.424972  0.567020  0.276232 -1.087401
    two -0.673690  0.113648 -1.478427  0.524988
baz one  0.404705  0.577046 -1.715002 -1.039268
    two -0.370647 -1.157892 -1.344312  0.844885
foo one  1.075770 -0.109050  1.643563 -1.469388
    two  0.357021 -0.674600 -1.776904 -0.968914
qux one -1.294524  0.413738  0.276662 -0.472035
    two -0.013960 -0.362543 -0.006154 -0.923061


In [14]: df_ver2
Out[14]: 
key  nm         0         1         2         3
bar one  0.895717  0.410835 -1.413681 -1.236269
    two  0.805244  0.813850  1.607920  0.896171
baz one -1.206412  0.132003  1.024180 -0.487602
    two  2.565646 -0.827317  0.569605 -0.082240
oof one  1.431256 -0.076467  0.875906 -2.182937
    two  1.340309 -1.187678 -2.211372  0.380396
qux one -1.170299  1.130127  0.974466  0.084844
    two -0.226169 -1.436737 -2.006747  0.432390


In [15]: df_total
Out[15]:
key  nm  ver            0         1         2         3
bar one ver1    -0.424972  0.567020  0.276232 -1.087401
        ver2     0.895717  0.410835 -1.413681 -1.236269
    two ver1    -0.673690  0.113648 -1.478427  0.524988
        ver2     0.805244  0.813850  1.607920  0.896171
baz one ver1     0.404705  0.577046 -1.715002 -1.039268
        ver2    -1.206412  0.132003  1.024180 -0.487602
    two ver1    -0.370647 -1.157892 -1.344312  0.844885
        ver2     2.565646 -0.827317  0.569605 -0.082240
qux one ver1    -1.294524  0.413738  0.276662 -0.472035
        ver2    -1.170299  1.130127  0.974466  0.084844
    two ver1    -0.013960 -0.362543 -0.006154 -0.923061
        ver2    -0.226169 -1.436737 -2.006747  0.432390

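As you can see, the first two levels of the MultiIndex are kept, and a third level indicates which DataFrame each row comes from, ver1 or ver2. The values stay aligned by column. One thing to keep in mind: df_ver1 has a foo key and df_ver2 has an oof key. Neither has a match in the other DataFrame, so both are dropped, as in an inner join. I hope I explained this well; let me know if you have any questions. Thanks for your help!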

1 answer:

Answer 0: (score: 1)

Take the intersection of the two DataFrames' indexes, then concat and sort_index, i.e.

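# d1 and d2 stand for the question's df_ver1 and df_ver2
idx = d1.index.intersection(d2.index)
one = pd.concat([d1.loc[idx], d2.loc[idx]], keys=['ver1', 'ver2'])

# the new outer level is unnamed, so reset_index() exposes it as 'level_0'; move it innermost
one.reset_index().set_index(['key', 'nm', 'level_0']).sort_index(level=['key', 'nm'])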

                      0         1         2         3
key nm  level_0                                        
bar one ver1    -0.424972  0.567020  0.276232 -1.087401
        ver2     0.895717  0.410835 -1.413681 -1.236269
    two ver1    -0.673690  0.113648 -1.478427  0.524988
        ver2     0.805244  0.813850  1.607920  0.896171
baz one ver1     0.404705  0.577046 -1.715002 -1.039268
        ver2    -1.206412  0.132003  1.024180 -0.487602
    two ver1    -0.370647 -1.157892 -1.344312  0.844885
        ver2     2.565646 -0.827317  0.569605 -0.082240
qux one ver1    -1.294524  0.413738  0.276662 -0.472035
        ver2    -1.170299  1.130127  0.974466  0.084844
    two ver1    -0.013960 -0.362543 -0.006154 -0.923061
        ver2    -0.226169 -1.436737 -2.006747  0.432390
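
The result above labels the new level level_0, whereas the question asks for ver. A minimal, self-contained sketch of the same intersection-and-concat idea that names the level directly might look like the following. The helper make_frame and the random values are hypothetical stand-ins for the question's data, and the names argument plus reorder_levels are swapped in here in place of the reset_index/set_index step from the answer.

```python
import numpy as np
import pandas as pd


def make_frame(keys, seed):
    # Hypothetical helper: builds a frame shaped like df_ver1/df_ver2 with random values.
    index = pd.MultiIndex.from_product([keys, ["one", "two"]], names=["key", "nm"])
    rng = np.random.default_rng(seed)
    return pd.DataFrame(rng.standard_normal((len(index), 4)), index=index)


df_ver1 = make_frame(["bar", "baz", "foo", "qux"], seed=0)  # has 'foo'
df_ver2 = make_frame(["bar", "baz", "oof", "qux"], seed=1)  # has 'oof'

# Keep only the (key, nm) pairs present in both frames (inner-join behaviour).
common = df_ver1.index.intersection(df_ver2.index)

df_total = (
    pd.concat(
        [df_ver1.loc[common], df_ver2.loc[common]],
        keys=["ver1", "ver2"],
        names=["ver", "key", "nm"],  # name the new outer level 'ver'
    )
    .reorder_levels(["key", "nm", "ver"])  # move 'ver' to the innermost position
    .sort_index()
)

print(df_total)
```

Naming the level in pd.concat avoids the round trip through ordinary columns, and sort_index() with no arguments sorts on all three levels, so the ver1 and ver2 rows for each (key, nm) pair end up next to each other.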