How do I merge two data frames that have duplicate rows?

Time: 2020-04-24 15:01:27

Tags: python-3.x pandas merge

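I have two data frames, df1 and df2. In df1 the values in the name column repeat, but the values in the hobby column differ. df2 also has repeated values in its name column. I want to merge the two data frames and keep everything.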

df1:
name   hobby

mike   cricket 
mike   football
jack   chess
jack   football
jack   vollyball
pieter sleeping
pieter cyclying

My df2:

df2:
name

mike
pieter 
jack  
mike
pieter 

Now I need to merge df1 onto df2 on the name column, so my result df3 should look like this:

df3:
name   hobby

mike   cricket 
mike   football
pieter sleeping
pieter cyclying
jack   chess
jack   football
jack   vollyball
mike   cricket 
mike   football
pieter sleeping
pieter cyclying


1 answer:

Answer 0: (score: 1)

IIUC, you want to assign an order to the rows of df2, merge on name, and then sort by that order:

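import numpy as np

# Record df2's original row order, merge, then restore that order.
(df2.assign(rank=np.arange(len(df2)))   # tag each df2 row with its position
    .merge(df1, on='name')              # inner join on name; duplicate names multiply rows
    .sort_values('rank')                # put the rows back in df2's order
    .drop('rank', axis=1)               # remove the helper column
)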

Output:

      name      hobby
0     mike    cricket
1     mike   football
4   pieter   sleeping
5   pieter   cyclying
8     jack      chess
9     jack   football
10    jack  vollyball
2     mike    cricket
3     mike   football
6   pieter   sleeping
7   pieter   cyclying
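
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The data frames are rebuilt from the tables in the question, with the spellings kept as posted. Two things here are my own additions rather than part of the answer: `kind="mergesort"` is passed to `sort_values` so that rows sharing the same `rank` keep the order the merge produced, and `reset_index(drop=True)` renumbers the result. The `df3_left` variant is also an assumption on my part: a left join should keep df2's row order without the helper column, but check it against your pandas version.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (spellings kept as posted).
df1 = pd.DataFrame({
    "name": ["mike", "mike", "jack", "jack", "jack", "pieter", "pieter"],
    "hobby": ["cricket", "football", "chess", "football", "vollyball",
              "sleeping", "cyclying"],
})
df2 = pd.DataFrame({"name": ["mike", "pieter", "jack", "mike", "pieter"]})

# Approach from the answer: tag each df2 row with its position,
# merge on name, then sort back into df2's order.
df3 = (
    df2.assign(rank=np.arange(len(df2)))
       .merge(df1, on="name")
       .sort_values("rank", kind="mergesort")  # stable sort: ties keep merge order
       .drop(columns="rank")
       .reset_index(drop=True)                 # renumber rows 0..n-1
)

# Possible alternative (an assumption, not from the answer):
# a left join keeps the key order of the left frame, here df2.
df3_left = df2.merge(df1, on="name", how="left")

print(df3)
```

Because every name in df2 also appears in df1, the inner and left joins return the same rows here. If df2 contained a name missing from df1, the left join would keep that row with NaN in hobby, while the inner join would drop it.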