Insert a pandas MultiIndex DataFrame into another MultiIndex DataFrame at a specific position

Time: 2017-02-03 10:16:22

Tags: python pandas dataframe merge multi-index

I have a parent_df and a child_df, as shown below.

parent_df:
x  y  colA
x1 y1 A1
x1 y2 A2
x2 y1 A3
x2 y2 A4

child_df:
p  q  colB colC
p1 q1 B1   C1
p1 q2 B2   C2
p2 q1 B3   C3
p2 q2 B4   C4

I want to modify parent_df, or create a new parent_df, by placing child_df into one specific row of parent_df, the row (x2, y1), so that:

parent_df:
x  y  p  q  colA colB colC
x1 y1       A1   NA   NA
x1 y2       A2   NA   NA
x2 y1 p1 q1 A3   B1   C1
      p1 q2 A3   B2   C2
      p2 q1 A3   B3   C3
      p2 q2 A3   B4   C4
x2 y2       A4   NA   NA

Is there a way to do this?

1 answer:

Answer 0 (score: 1)

I think you need merge followed by sort_index:

import pandas as pd

print (parent_df)
      colA
x  y      
x1 y1   A1
   y2   A2
x2 y1   A3
   y2   A4

print (child_df)
      colB colC
p  q           
p1 q1   B1   C1
   q2   B2   C2
p2 q1   B3   C3
   q2   B4   C4

#create new columns holding the key of the target parent row (x2, y1)
child_df['x'] = 'x2'
child_df['y'] = 'y1'
#move p, q out of the index and set the index by the new columns
child_df = child_df.reset_index().set_index(['x','y'])
print (child_df)
        p   q colB colC
x  y                   
x2 y1  p1  q1   B1   C1
   y1  p1  q2   B2   C2
   y1  p2  q1   B3   C3
   y1  p2  q2   B4   C4

df = pd.merge(parent_df, child_df, left_index=True, right_index=True, how='outer')
#replace NaN in the p, q columns with '', append them to the index and sort the index
df = df.fillna({'p':'','q':''}).set_index(['p','q'], append=True).sort_index()
print (df)
            colA colB colC
x  y  p  q                
x1 y1         A1  NaN  NaN
   y2         A2  NaN  NaN
x2 y1 p1 q1   A3   B1   C1
         q2   A3   B2   C2
      p2 q1   A3   B3   C3
         q2   A3   B4   C4
   y2         A4  NaN  NaN
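
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps. The sample frames are rebuilt from the tables in the question, and the helper name insert_child and its key argument are my own, for illustration only, not part of pandas. It uses assign instead of adding the columns in place, so the original child_df is left untouched.

```python
import pandas as pd

# Sample data rebuilt from the question.
parent_df = pd.DataFrame({
    "x": ["x1", "x1", "x2", "x2"],
    "y": ["y1", "y2", "y1", "y2"],
    "colA": ["A1", "A2", "A3", "A4"],
}).set_index(["x", "y"])

child_df = pd.DataFrame({
    "p": ["p1", "p1", "p2", "p2"],
    "q": ["q1", "q2", "q1", "q2"],
    "colB": ["B1", "B2", "B3", "B4"],
    "colC": ["C1", "C2", "C3", "C4"],
}).set_index(["p", "q"])


def insert_child(parent, child, key):
    """Hypothetical helper: attach `child` under the `parent` row given by `key`."""
    parent_levels = list(parent.index.names)
    child_levels = list(child.index.names)

    # Give every child row the key of the target parent row and index by it.
    keyed = (
        child.reset_index()
        .assign(**dict(zip(parent_levels, key)))
        .set_index(parent_levels)
    )

    # The outer merge keeps parent rows that have no child rows (NaN in child columns).
    merged = parent.merge(keyed, left_index=True, right_index=True, how="outer")

    # Blank out the missing child index values, then append them as extra index levels.
    merged = merged.fillna({name: "" for name in child_levels})
    return merged.set_index(child_levels, append=True).sort_index()


result = insert_child(parent_df, child_df, ("x2", "y1"))
print(result)
```

Note that the empty strings in the p and q levels are only placeholders for the parent rows that have no child rows. If a level should keep NaN instead, skip the fillna step, but then sort_index may place those rows differently.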