Join 2 dataframes with different numbers of rows for the same index

Time: 2018-01-31 16:03:31

Tags: python pandas dataframe

Hi, I have 2 dataframes, df1 and df2.

import numpy as np
import pandas as pd

df1=pd.DataFrame(data=[['a1','a2'],['b1','b2'],['c1','c2']],columns=['HR','RR'],index=[0,0,1])

df1
Out[146]: 
   HR  RR
0  a1  a2
0  b1  b2
1  c1  c2

df2=pd.DataFrame(data=[['1','2'],['1','2'],['1','2'],['1','2'],['1','2']],columns=['ST','SR'],index=[0,0,0,0,1])

df2
Out[147]: 
  ST SR
0  1  2
0  1  2
0  1  2
0  1  2
1  1  2

How can I join them to get result and result2?

result = pd.DataFrame(data=[['a1','a2',1,2],['b1','b2',1,2],[np.nan,np.nan,1,2],[np.nan,np.nan,1,2],['c1','c2',1,2]],columns=['HR','RR','ST','SR'],index=[0,0,0,0,1])

result
Out[148]: 
    HR   RR  ST  SR
0   a1   a2   1   2
0   b1   b2   1   2
0  NaN  NaN   1   2
0  NaN  NaN   1   2
1   c1   c2   1   2

result2 = pd.DataFrame(data=[[np.nan,np.nan,1,2],[np.nan,np.nan,1,2],['a1','a2',1,2],['b1','b2',1,2],['c1','c2',1,2]],columns=['HR','RR','ST','SR'],index=[0,0,0,0,1])

result2
Out[148]: 
    HR   RR  ST  SR
0  NaN  NaN   1   2
0  NaN  NaN   1   2
0   a1   a2   1   2
0   b1   b2   1   2
1   c1   c2   1   2

1 Answer:

Answer 0: (score: 0)

Append a second index level that holds each row's position within its current index label, then do an outer join:

df1.set_index(
    df1.groupby(level=0).cumcount(), 
    append=True
).join(
    df2.set_index(
        df2.groupby(level=0).cumcount(), 
        append=True
    ),
    how="outer"
).reset_index(level=1, drop=True)

#    HR  RR  ST SR
#0   a1  a2   1  2
#0   b1  b2   1  2
#0  NaN NaN   1  2
#0  NaN NaN   1  2
#1   c1  c2   1  2
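
To see why the join lines up row by row, it can help to look at the helper key on its own. The following is a minimal sketch that rebuilds df1 and df2 from the question and inspects what groupby(level=0).cumcount() returns; the names key1, key2 and joined are illustrative only and do not appear in the original answer.

```python
import pandas as pd

df1 = pd.DataFrame([['a1', 'a2'], ['b1', 'b2'], ['c1', 'c2']],
                   columns=['HR', 'RR'], index=[0, 0, 1])
df2 = pd.DataFrame([['1', '2']] * 5, columns=['ST', 'SR'],
                   index=[0, 0, 0, 0, 1])

# Position of each row within its group of equal index labels.
key1 = df1.groupby(level=0).cumcount()
key2 = df2.groupby(level=0).cumcount()
print(key1.tolist())
print(key2.tolist())

# Each (label, position) pair is unique within a frame, so the outer
# join matches rows one-to-one instead of producing a cross product.
joined = (
    df1.set_index(key1, append=True)
       .join(df2.set_index(key2, append=True), how="outer")
       .reset_index(level=1, drop=True)
)
print(joined)
```

Without the extra level, joining on the duplicated label 0 alone would pair every label-0 row of df1 with every label-0 row of df2.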

To move the NaN rows to the top instead, reverse both dataframes first, then reverse the result back after the join:

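# Note: this rebinds df1 and df2 to reversed copies of the originals.
df1 = df1.iloc[::-1]
df2 = df2.iloc[::-1]
# kind="mergesort" makes the final sort stable, so the row order
# within each index label is guaranteed to survive the sort.
df1.set_index(df1.groupby(level=0).cumcount(), append=True).join(
    df2.set_index(df2.groupby(level=0).cumcount(), append=True),
    how="outer"
).reset_index(level=1, drop=True).iloc[::-1].sort_index(kind="mergesort")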


#    HR  RR  ST SR
#0  NaN NaN   1  2
#0  NaN NaN   1  2
#0   a1  a2   1  2
#0   b1  b2   1  2
#1   c1  c2   1  2
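
Both variants can be wrapped in one helper that leaves the input frames untouched. This is only a sketch under the same assumptions as the answer; the function name join_by_position and its nan_first flag are hypothetical, not part of pandas.

```python
import pandas as pd


def join_by_position(left, right, nan_first=False):
    """Outer-join two frames on (index label, row position within label)."""
    if nan_first:
        # Work on reversed copies so positions count from the end of each group.
        left, right = left.iloc[::-1], right.iloc[::-1]
    out = (
        left.set_index(left.groupby(level=0).cumcount(), append=True)
            .join(right.set_index(right.groupby(level=0).cumcount(), append=True),
                  how="outer")
            .reset_index(level=1, drop=True)
    )
    if nan_first:
        # Undo the reversal; a stable sort keeps the order inside each label.
        out = out.iloc[::-1].sort_index(kind="mergesort")
    return out


df1 = pd.DataFrame([['a1', 'a2'], ['b1', 'b2'], ['c1', 'c2']],
                   columns=['HR', 'RR'], index=[0, 0, 1])
df2 = pd.DataFrame([['1', '2']] * 5, columns=['ST', 'SR'],
                   index=[0, 0, 0, 0, 1])

result = join_by_position(df1, df2)                   # NaN rows last within label 0
result2 = join_by_position(df1, df2, nan_first=True)  # NaN rows first within label 0
print(result)
print(result2)
```

Because the reversal happens on local names inside the function, df1 and df2 keep their original row order after the call.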