Concatenating uneven Pandas MultiIndexes

Date: 2016-04-23 19:09:49

Tags: python pandas

import pandas as pd

d1 = {'A': ['a'],
      'B1': ['b1'],
      'C1': ['c1']}

d2 = {'A': ['a'],
      'B2': ['b2'],
      'C2': ['c2']}

df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)

df1.set_index(['A', 'B1'], inplace=True)
df2.set_index(['A', 'B2'], inplace=True)

df = pd.concat([df1, df2], axis=0)

print(df)

I get this output:

       C1   C2
A B1          
a b1   c1  NaN
  b2  NaN   c2

However, I want:

                 C1    C2  
A   B1   B2              
a   b1   NaN     c1    NaN
a   NaN  b2      NaN   c2 

What are the rules for concatenating MultiIndexes in Pandas?

How can I get the desired result?

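1 answer:

Answer 0 (score: 1)

Update: handling duplicated columns (A is an index level in both frames, so it must appear only once in the new index):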

In [39]: pd.concat([df1.reset_index(),df2.reset_index()])\
   ....:   .set_index(pd.unique(df1.index.names + df2.index.names).tolist())
Out[39]:
            C1   C2
A B1  B2
a b1  NaN   c1  NaN
  NaN b2   NaN   c2
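
The same approach as a self-contained script, offered as a minimal sketch that assumes the df1 and df2 from the question. It builds the de-duplicated list of level names with dict.fromkeys rather than pd.unique; that is a substitution of mine, not part of the original answer, made because pd.unique is documented for array-like input and recent pandas versions may warn when it is given a plain list. The names level_names, combined and result are illustrative only.

```python
import pandas as pd

# Frames from the question, each indexed by 'A' plus its own second level.
df1 = pd.DataFrame({'A': ['a'], 'B1': ['b1'], 'C1': ['c1']}).set_index(['A', 'B1'])
df2 = pd.DataFrame({'A': ['a'], 'B2': ['b2'], 'C2': ['c2']}).set_index(['A', 'B2'])

# Union of the index level names in first-seen order, without duplicates
# (dict.fromkeys keeps insertion order): ['A', 'B1', 'B2'].
level_names = list(dict.fromkeys(list(df1.index.names) + list(df2.index.names)))

# Turn the index levels back into ordinary columns, stack the rows
# (missing columns are filled with NaN), then rebuild a three-level index.
combined = pd.concat([df1.reset_index(), df2.reset_index()], ignore_index=True, sort=False)
result = combined.set_index(level_names)

print(result)
```

Because the levels are ordinary columns at the moment of concatenation, each frame contributes NaN for the level it lacks, which is what produces the separate B1 and B2 levels in the result.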

Old answer:

In [259]: pd.concat([df1.reset_index(), df2.reset_index()]).set_index(df1.index.names + df2.index.names)
Out[259]:
                   C    F
A   B   D   E
a   b   NaN NaN    c  NaN
NaN NaN d   e    NaN    f

Alternatively, you can try merge(), assuming df1 is empty:

df1.reset_index().merge(df2.reset_index(), left_index=True, right_index=True, how='left').set_index(df1.index.names + df2.index.names)