I have two data frames like these:
DF1
name age surname previous_surname
Andrese 20 William William
Jancy 25 Thomas Thomas
Andronella 22 Harry Harry
Amelia 21 Jack Jack
DF2
name age surname
Andrese 20 Harrison
Jancy 25 James
Jessica 22 Litpick
Amelia 21 -
I want to add to DF1 anything that DF2 contains but DF1 does not. Basically, I want one all-inclusive DF that looks like this:
name age surname previous_surname
Andrese 20 Harrison William
Jancy 25 James Thomas
Andronella 22 Harry Harry
Amelia 21 Jack Jack
Jessica 22 Litpick -
Answer 0 (score: 0)
You need a merge followed by combine_first:
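import numpy as np
import pandas as pd
# Outer merge keeps rows from both frames; prefer df2's surname, fall back to df1's
df = pd.merge(df1, df2, on=['name', 'age'], how='outer').replace({'-': np.nan})
df['surname'] = df['surname_y'].combine_first(df['surname_x'])
df = df.drop(['surname_x', 'surname_y'], axis=1)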
name age previous_surname surname
0 Andrese 20 William Harrison
1 Jancy 25 Thomas James
2 Andronella 22 Harry Harry
3 Amelia 21 Jack Jack
4 Jessica 22 NaN Litpick
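For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same merge plus combine_first approach. The frames are rebuilt from the tables in the question. The last two steps (filling the missing previous_surname with '-' and reordering the columns) are my own additions to match the layout the question asks for, not part of the answer above. The names merged and result are only illustrative. Note that the row order of an outer merge can differ between pandas versions, so the sketch does not rely on it.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "name": ["Andrese", "Jancy", "Andronella", "Amelia"],
    "age": [20, 25, 22, 21],
    "surname": ["William", "Thomas", "Harry", "Jack"],
    "previous_surname": ["William", "Thomas", "Harry", "Jack"],
})
df2 = pd.DataFrame({
    "name": ["Andrese", "Jancy", "Jessica", "Amelia"],
    "age": [20, 25, 22, 21],
    "surname": ["Harrison", "James", "Litpick", "-"],
})

# Outer merge keeps every (name, age) pair from either frame.
# The clashing "surname" columns become surname_x (df1) and surname_y (df2).
merged = pd.merge(df1, df2, on=["name", "age"], how="outer").replace({"-": np.nan})

# Take df2's surname where present, otherwise fall back to df1's.
merged["surname"] = merged["surname_y"].combine_first(merged["surname_x"])
result = merged.drop(columns=["surname_x", "surname_y"])

# Assumption: show "-" for a missing previous_surname, as in the question's target table.
result["previous_surname"] = result["previous_surname"].fillna("-")

# Assumption: restore the column order used in the question.
result = result[["name", "age", "surname", "previous_surname"]]
print(result)
```

If the "-" placeholder is not needed, the fillna step can simply be dropped, which leaves NaN for Jessica's previous_surname as in the output above.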