Suppose I have two DataFrames, df1 and df2, shown below:
Assuming their index is subject_id, how can I combine them?
df1:
subject_id  first_name  last_name
1           Alex        Anderson
2           Amy         Ackerman
3           Allen       Ali
4           Alice       Aoni
5           Ayoung      Atiches

df2:
subject_id  first_name  last_name
4           Billy       Bonder
5           Brian       Black
6           Bran        Balwner
7           Bryce       Brice
8           Betty       Btisan
While I'm at it, how do I get this:
subject_id  first_name  last_name
1           Alex        Anderson
2           Amy         Ackerman
3           Allen       Ali
4           Billy       Bonder
5           Brian       Black
6           Bran        Balwner
7           Bryce       Brice
8           Betty       Btisan
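
To make the question reproducible, here is a minimal sketch that rebuilds the two input frames. It assumes subject_id is an ordinary column rather than the index (the first answer calls set_index on it); the values are copied from the tables above, and the name `overlap` is only there for illustration.

```python
import pandas as pd

# Rebuild the two frames from the question; subject_id stays a regular column.
df1 = pd.DataFrame({
    "subject_id": [1, 2, 3, 4, 5],
    "first_name": ["Alex", "Amy", "Allen", "Alice", "Ayoung"],
    "last_name": ["Anderson", "Ackerman", "Ali", "Aoni", "Atiches"],
})
df2 = pd.DataFrame({
    "subject_id": [4, 5, 6, 7, 8],
    "first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty"],
    "last_name": ["Bonder", "Black", "Balwner", "Brice", "Btisan"],
})

# The ids present in both frames are the ones where a priority rule is needed.
overlap = sorted(set(df1["subject_id"]) & set(df2["subject_id"]))
print(overlap)
```

Only subject_id 4 and 5 appear in both frames, so those are the rows where it matters which frame takes priority.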
Answer 0 (score: 2)
Use combine_first, calling set_index first if necessary:
# Index both frames by subject_id so that rows are aligned by id
df11 = df1.set_index('subject_id')
df22 = df2.set_index('subject_id')

# df22 takes priority; df11 only fills ids (and NaN cells) missing from df22
df3 = df22.combine_first(df11).reset_index()
print(df3)
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Billy    Bonder
4           5      Brian     Black
5           6       Bran   Balwner
6           7      Bryce     Brice
7           8      Betty    Btisan

# Swap the order so that df11 takes priority instead
df3 = df11.combine_first(df22).reset_index()
print(df3)
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Alice      Aoni
4           5     Ayoung   Atiches
5           6       Bran   Balwner
6           7      Bryce     Brice
7           8      Betty    Btisan
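
For anyone who wants to run this end to end, the following is a self-contained sketch of the same combine_first approach. The helper name `combine_by_id` and its parameters are hypothetical, introduced here only for illustration, and the frames are rebuilt from the question's tables on the assumption that subject_id is a regular column.

```python
import pandas as pd

# Frames rebuilt from the question (assumption: subject_id is a column).
df1 = pd.DataFrame({
    "subject_id": [1, 2, 3, 4, 5],
    "first_name": ["Alex", "Amy", "Allen", "Alice", "Ayoung"],
    "last_name": ["Anderson", "Ackerman", "Ali", "Aoni", "Atiches"],
})
df2 = pd.DataFrame({
    "subject_id": [4, 5, 6, 7, 8],
    "first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty"],
    "last_name": ["Bonder", "Black", "Balwner", "Brice", "Btisan"],
})


def combine_by_id(preferred, fallback, key="subject_id"):
    """Hypothetical helper: `preferred` wins on shared keys, `fallback` fills the rest."""
    combined = preferred.set_index(key).combine_first(fallback.set_index(key))
    return combined.reset_index()


df2_wins = combine_by_id(df2, df1)  # ids 4 and 5 come from df2
df1_wins = combine_by_id(df1, df2)  # ids 4 and 5 come from df1
print(df2_wins)
print(df1_wins)
```

One caveat: combine_first works cell by cell, so a NaN in the preferred frame is filled from the fallback frame even when the row's id exists in both. If whole rows should win regardless of missing values, a different approach is needed.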
Answer 1 (score: 1)
We can use pd.concat and drop_duplicates (sorry, the formatting seems to get hidden, which makes the answer look ugly):
# keep='first': for a duplicated subject_id, the row from df1 is kept
pd.concat([df1, df2]).drop_duplicates('subject_id', keep='first')
Out[95]:
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Alice      Aoni
4           5     Ayoung   Atiches
2           6       Bran   Balwner
3           7      Bryce     Brice
4           8      Betty    Btisan

# keep='last': for a duplicated subject_id, the row from df2 is kept
pd.concat([df1, df2]).drop_duplicates('subject_id', keep='last')
Out[96]:
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
0           4      Billy    Bonder
1           5      Brian     Black
2           6       Bran   Balwner
3           7      Bryce     Brice
4           8      Betty    Btisan
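
Note that both outputs above keep the original row labels, so the index contains repeated values. A possible variation, sketched below with the frames rebuilt from the question (assuming subject_id is a regular column), sorts by subject_id and resets the index so the result has a clean 0..7 index. The variable names are illustrative only.

```python
import pandas as pd

# Frames rebuilt from the question (assumption: subject_id is a column).
df1 = pd.DataFrame({
    "subject_id": [1, 2, 3, 4, 5],
    "first_name": ["Alex", "Amy", "Allen", "Alice", "Ayoung"],
    "last_name": ["Anderson", "Ackerman", "Ali", "Aoni", "Atiches"],
})
df2 = pd.DataFrame({
    "subject_id": [4, 5, 6, 7, 8],
    "first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty"],
    "last_name": ["Bonder", "Black", "Balwner", "Brice", "Btisan"],
})

# Stack df1 on top of df2; df2's rows come last in the stacked frame.
stacked = pd.concat([df1, df2], ignore_index=True)

# keep="last" keeps df2's row for a shared id, keep="first" keeps df1's.
df2_wins = (
    stacked.drop_duplicates(subset="subject_id", keep="last")
    .sort_values("subject_id")
    .reset_index(drop=True)
)
df1_wins = (
    stacked.drop_duplicates(subset="subject_id", keep="first")
    .sort_values("subject_id")
    .reset_index(drop=True)
)
print(df2_wins)
print(df1_wins)
```

Unlike combine_first, this approach keeps or drops whole rows, so a NaN in the winning row is not filled from the other frame.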