Replace one DataFrame with another, but keep the original

Asked: 2017-05-19 15:58:44

Tags: python pandas numpy

I have two data frames like this:

  DF1 

  name        age    surname   previous_surname 
  Andrese     20     William   William
  Jancy       25     Thomas    Thomas
  Andronella  22     Harry     Harry
  Amelia      21     Jack      Jack

  DF2

  name        age    surname   
  Andrese     20     Harrison   
  Jancy       25     James   
  Jessica     22     Litpick
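  Amelia      21     -

1. I want to replace DF1 with DF2, matching on name and age.
2. I want to keep any records that are missing from DF2 but present in DF1.
3. I want to add any records that DF2 contains but DF1 does not.

Basically I want an all-inclusive DF that looks like this: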

      name        age    surname   previous_surname 
      Andrese     20     Harrison   William
      Jancy       25     James      Thomas
      Andronella  22     Harry      Harry
      Amelia      21     Jack       Jack
      Jessica     22     Litpick    - 
    

1 answer:

Answer 0 (score: 0)

You need an outer merge together with combine_first.
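
As a quick illustration of what `combine_first` does on its own, here is a minimal sketch. The two Series (`new` and `old`) are made up for this example and simply stand in for the two surname columns that a merge produces.

```python
import numpy as np
import pandas as pd

# Hypothetical Series standing in for the surname columns after a merge.
new = pd.Series(["Harrison", np.nan, "Litpick"])
old = pd.Series(["William", "Harry", np.nan])

# Values from `new` win; gaps (NaN) in `new` are filled from `old`.
result = new.combine_first(old)
print(result.tolist())
```

This is why the `-` placeholder has to be turned into NaN first: `combine_first` only falls back to the other Series where the calling Series is missing.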

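import numpy as np
import pandas as pd
# The outer merge keeps rows from both frames; '-' is treated as missing.
df = pd.merge(df1, df2, on=['name', 'age'], how='outer').replace({'-': np.nan})
df['surname'] = df['surname_y'].combine_first(df['surname_x'])  # DF2's surname if present, else DF1's
df = df.drop(['surname_x', 'surname_y'], axis=1)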

    name        age previous_surname    surname
0   Andrese     20  William             Harrison
1   Jancy       25  Thomas              James
2   Andronella  22  Harry               Harry
3   Amelia      21  Jack                Jack
4   Jessica     22  NaN                 Litpick
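
For anyone who wants to run this end to end, the following is a sketch that rebuilds the two frames from the question and applies the same steps. The frame contents are copied from the question; the variable name `merged` and the final `fillna("-")` step (to show `-` instead of NaN, as in the desired output) are my own additions and can be dropped.

```python
import numpy as np
import pandas as pd

# Frames copied from the question.
df1 = pd.DataFrame({
    "name": ["Andrese", "Jancy", "Andronella", "Amelia"],
    "age": [20, 25, 22, 21],
    "surname": ["William", "Thomas", "Harry", "Jack"],
    "previous_surname": ["William", "Thomas", "Harry", "Jack"],
})
df2 = pd.DataFrame({
    "name": ["Andrese", "Jancy", "Jessica", "Amelia"],
    "age": [20, 25, 22, 21],
    "surname": ["Harrison", "James", "Litpick", "-"],
})

# Outer merge on the key columns; '-' becomes NaN so it counts as missing.
merged = pd.merge(df1, df2, on=["name", "age"], how="outer").replace({"-": np.nan})

# Prefer the surname from df2 (surname_y), fall back to df1 (surname_x).
merged["surname"] = merged["surname_y"].combine_first(merged["surname_x"])
merged = merged.drop(["surname_x", "surname_y"], axis=1)

# Optional (an assumption about the desired display): show '-' instead of NaN.
merged["previous_surname"] = merged["previous_surname"].fillna("-")

# Sort by name for display so the printout does not rely on merge row order.
print(merged.sort_values("name").to_string(index=False))
```

Amelia keeps the surname Jack because her `-` in DF2 is treated as missing, and Jessica appears with no previous surname because she only exists in DF2.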