pandas combine_first, but always overwrite

Asked: 2016-02-02 03:23:14

Tags: python pandas merge

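Can the following be improved?

It achieves the desired result of copying values from df2 into df1 wherever the indexes can be matched. It just seems inefficient and clunky.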

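import pandas as pd

# Two-level index written as explicit tuples: ('A', 'B') is level 0 = 'A', level 1 = 'B'
df1 = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=pd.MultiIndex.from_tuples([('A', 'B'), ('A', 'C')]), columns=['X', 'Y', 'Z'])
df2 = pd.DataFrame([102, 103], index=pd.MultiIndex.from_tuples([('A', 'C'), ('A', 'D')]), columns=['Y'])
# df2 takes priority; gaps are filled from df1
desired = df2.combine_first(df1).combine_first(df2)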

print(df1)
print(df2)
print(desired)

Output:

df1
     X  Y  Z
A B  0  1  2
  C  3  4  5

df2
       Y
A C  102
  D  103

desired
      X    Y   Z
A B   0    1   2
  C   3  102   5
  D NaN  103 NaN
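As a cross-check on the expression above, the sketch below compares the chained call with a single `df2.combine_first(df1)`. It assumes a recent pandas version, where the caller's non-null values already take priority, so the second `combine_first(df2)` should have nothing left to fill. The names `one_pass` and `two_pass` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[0, 1, 2], [3, 4, 5]],
    index=pd.MultiIndex.from_tuples([("A", "B"), ("A", "C")]),
    columns=["X", "Y", "Z"],
)
df2 = pd.DataFrame(
    [102, 103],
    index=pd.MultiIndex.from_tuples([("A", "C"), ("A", "D")]),
    columns=["Y"],
)

# df2's non-null values win; everything else is filled from df1.
one_pass = df2.combine_first(df1)

# The expression from the question, for comparison.
two_pass = df2.combine_first(df1).combine_first(df2)

# Compare values only; dtypes may differ between pandas versions.
pd.testing.assert_frame_equal(one_pass, two_pass, check_dtype=False)
print(one_pass)
```

If the comparison passes on your pandas version, the trailing `.combine_first(df2)` can be dropped.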

The closest I have come using slicing is

print(df1.loc[df2.index, df2.columns])  # This works, demonstrated lhs of below is OK
df1.loc[df2.index, df2.columns] = df2   # This fails, as does df2.values

1 answer:

Answer 0: (score: 0)

Why not use merge?

>>> df3 = pd.merge(df1, df2, left_index=True, right_index=True, how='outer')
>>> df3
       X  Y_x    Z    Y_y
A B  0.0  1.0  2.0    NaN
  C  3.0  4.0  5.0  102.0
  D  NaN  NaN  NaN  103.0
>>> df3['Y'] = df3['Y_y'].combine_first(df3['Y_x'])
>>> df3.drop(['Y_x', 'Y_y'], axis=1)
       X    Z      Y
A B  0.0  2.0    1.0
  C  3.0  5.0  102.0
  D  NaN  NaN  103.0
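A possible alternative that avoids the suffixed `Y_x`/`Y_y` columns is to extend df1 to the union of both indexes and then overwrite in place with `DataFrame.update`. This is only a sketch, not part of the answer above; the name `result` is illustrative. Note that `update` writes only non-NA values from df2, and only into rows and columns that already exist in the target, which is why the reindex comes first.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[0, 1, 2], [3, 4, 5]],
    index=pd.MultiIndex.from_tuples([("A", "B"), ("A", "C")]),
    columns=["X", "Y", "Z"],
)
df2 = pd.DataFrame(
    [102, 103],
    index=pd.MultiIndex.from_tuples([("A", "C"), ("A", "D")]),
    columns=["Y"],
)

# Add the rows that exist only in df2 (filled with NaN); df1 itself is unchanged.
result = df1.reindex(df1.index.union(df2.index))

# Overwrite matching cells in place with df2's non-NA values.
result.update(df2)
print(result)
```

The column order stays X, Y, Z, whereas the merge approach moves Y to the end.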