How do I merge pandas dataframes while dropping rows from the first dataframe that share an index with the second?

Time: 2017-11-23 14:46:52

Tags: python pandas join dataframe merge

Suppose I have two dataframes, df1 and df2, and assume their index is subject_id. The first table below is df1 and the second is df2:

subject_id first_name last_name        
1                Alex  Anderson
2                 Amy  Ackerman
3               Allen       Ali
4               Alice      Aoni
5              Ayoung   Atiches

subject_id first_name last_name
4               Billy    Bonder
5               Brian     Black
6                Bran   Balwner
7               Bryce     Brice
8               Betty    Btisan
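The two frames above can be reproduced with a sketch like the one below. The construction from dicts is an assumption, since the question does not show its setup code. subject_id is kept as an ordinary column here so that the code in the answers runs unchanged; calling set_index('subject_id') gives the indexed form the question describes.

```python
import pandas as pd

# df1 and df2 rebuilt from the tables in the question.
# subject_id is a plain column here (an assumption, see the note above).
df1 = pd.DataFrame(
    {
        "subject_id": [1, 2, 3, 4, 5],
        "first_name": ["Alex", "Amy", "Allen", "Alice", "Ayoung"],
        "last_name": ["Anderson", "Ackerman", "Ali", "Aoni", "Atiches"],
    }
)
df2 = pd.DataFrame(
    {
        "subject_id": [4, 5, 6, 7, 8],
        "first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty"],
        "last_name": ["Bonder", "Black", "Balwner", "Brice", "Btisan"],
    }
)

# subject_id values that occur in both frames
overlap = sorted(set(df1["subject_id"]) & set(df2["subject_id"]))
print(overlap)
```

The overlapping subject_id values are the rows for which one frame has to take priority over the other.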

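How do I get the following result, in which the rows of df1 whose subject_id also appears in df2 are dropped in favor of the rows from df2: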

subject_id first_name last_name        
1                Alex  Anderson
2                 Amy  Ackerman
3               Allen       Ali
4               Billy    Bonder
5               Brian     Black
6                Bran   Balwner
7               Bryce     Brice
8               Betty    Btisan

2 Answers:

Answer 0 (score: 2)

Use combine_first. If subject_id is a column rather than the index, call set_index first:

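# Move subject_id into the index (skip this if it already is the index)
df11 = df1.set_index('subject_id')
df22 = df2.set_index('subject_id')

# df22 takes priority: its non-null values win on overlapping subject_id
df3 = df22.combine_first(df11).reset_index()
print(df3)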
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Billy    Bonder
4           5      Brian     Black
5           6       Bran   Balwner
6           7      Bryce     Brice
7           8      Betty    Btisan

# df11 takes priority: its non-null values win on overlapping subject_id
df3 = df11.combine_first(df22).reset_index()
print(df3)
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Alice      Aoni
4           5     Ayoung   Atiches
5           6       Bran   Balwner
6           7      Bryce     Brice
7           8      Betty    Btisan
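One caveat with this approach: combine_first only fills missing values in the calling frame, so a NaN cell in the preferred frame is filled from the other frame instead of being kept. If whole rows of the preferred frame should always win, a sketch like the following may be closer to the question. It drops the overlapping index labels from the first frame before concatenating. The frames a and b are small illustrative data, not the question's.

```python
import numpy as np
import pandas as pd

# Illustrative frames indexed by subject_id; b has a NaN at subject_id 4.
a = pd.DataFrame(
    {"first_name": ["Alice", "Ayoung"], "last_name": ["Aoni", "Atiches"]},
    index=pd.Index([4, 5], name="subject_id"),
)
b = pd.DataFrame(
    {"first_name": ["Billy", "Bran"], "last_name": [np.nan, "Balwner"]},
    index=pd.Index([4, 6], name="subject_id"),
)

# combine_first: b's NaN last_name at subject_id 4 is filled from a.
filled = b.combine_first(a)

# Row-level replacement: drop a's rows whose label is in b, then append b.
replaced = pd.concat([a[~a.index.isin(b.index)], b]).sort_index()

print(filled)
print(replaced)
```

With data that has no missing values, as in the question, both variants give the same rows.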

Answer 1 (score: 1)

We can use pd.concat followed by drop_duplicates. keep='first' keeps the rows from df1 for duplicated subject_id values, and keep='last' keeps the rows from df2:

pd.concat([df1,df2]).drop_duplicates('subject_id',keep='first')

Out[95]:
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Alice      Aoni
4           5     Ayoung   Atiches
2           6       Bran   Balwner
3           7      Bryce     Brice
4           8      Betty    Btisan

pd.concat([df1,df2]).drop_duplicates('subject_id',keep='last')

Out[96]:
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
0           4      Billy    Bonder
1           5      Brian     Black
2           6       Bran   Balwner
3           7      Bryce     Brice
4           8      Betty    Btisan
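The calls above rely on subject_id being a column. If it is the index, as the question assumes, drop_duplicates does not consider it. One option in that case is to filter on Index.duplicated instead. A minimal sketch, using small illustrative frames in place of the question's data:

```python
import pandas as pd

# Illustrative frames with subject_id as the index; label 4 is in both.
df1 = pd.DataFrame(
    {"first_name": ["Allen", "Alice"]},
    index=pd.Index([3, 4], name="subject_id"),
)
df2 = pd.DataFrame(
    {"first_name": ["Billy", "Bran"]},
    index=pd.Index([4, 6], name="subject_id"),
)

both = pd.concat([df1, df2])

# keep="last" marks the earlier copy as the duplicate, so df2's row survives.
prefer_df2 = both[~both.index.duplicated(keep="last")]

# keep="first" marks the later copy as the duplicate, so df1's row survives.
prefer_df1 = both[~both.index.duplicated(keep="first")]

print(prefer_df2)
print(prefer_df1)
```

In the column-based version, the result keeps the original row labels from both frames, which is why the outputs above show repeated labels such as 2, 3, 4. Appending reset_index(drop=True) renumbers the rows if that matters.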