How to merge/join two pandas DataFrames so the result keeps all rows from the left table and repeats the matching values from the right DataFrame

Time: 2017-03-08 19:47:53

Tags: python pandas dataframe

If df1 looks like:

Build_ID, Request_ID, Group_ID, Average
185, 100, G1, 200
186, 100, G1, 201
185, 102, G1, 203
186, 102, G1, 205
185, 200, G3, 200
186, 200, G3, 201
185, 202, G3, 203
186, 202, G3, 205

and df2 looks like:

Build_ID, Group_ID, Group_Average
185, G1, 300
186, G1, 301
185, G3, 401
186, G3, 402

the final result should look like:

Build_ID, Request_ID, Group_ID, Average, Group_Average
185, 100, G1, 200, 300
186, 100, G1, 201, 301
185, 102, G1, 203, 300
186, 102, G1, 205, 301
185, 200, G3, 200, 401
186, 200, G3, 201, 402
185, 202, G3, 203, 401
186, 202, G3, 205, 402

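Basically, the result should contain all rows from df1, plus the Group_Average from df2 that matches each row's Group_ID and Build_ID. I tried merge and concat with different join types, but could not get the result I am looking for. Thanks.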

1 Answer:

Answer 0 (score: 0)

Is this what you want?

In [60]: df1.merge(df2, on=['Build_ID','Group_ID'])
Out[60]:
   Build_ID  Request_ID Group_ID  Average  Group_Average
0       185         100       G1      200            300
1       185         102       G1      203            300
2       186         100       G1      201            301
3       186         102       G1      205            301
4       185         200       G3      200            401
5       185         202       G3      203            401
6       186         200       G3      201            402
7       186         202       G3      205            402
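
The rows above come out in a different order than in the question. If the order of df1 matters, an explicit left join may be closer to what was asked for. The following is only a sketch: the frames are rebuilt by hand from the tables in the question, and the `how="left"` and `validate="many_to_one"` arguments are my additions, not part of the original call. `validate` needs a reasonably recent pandas and simply raises an error if df2 ever contains the same (Build_ID, Group_ID) pair twice, which would otherwise silently duplicate rows.

```python
import pandas as pd

# Rebuild the two frames from the tables in the question.
df1 = pd.DataFrame({
    "Build_ID":   [185, 186, 185, 186, 185, 186, 185, 186],
    "Request_ID": [100, 100, 102, 102, 200, 200, 202, 202],
    "Group_ID":   ["G1", "G1", "G1", "G1", "G3", "G3", "G3", "G3"],
    "Average":    [200, 201, 203, 205, 200, 201, 203, 205],
})
df2 = pd.DataFrame({
    "Build_ID":      [185, 186, 185, 186],
    "Group_ID":      ["G1", "G1", "G3", "G3"],
    "Group_Average": [300, 301, 401, 402],
})

# how="left" keeps every row of df1, in df1's order.
# validate="many_to_one" checks that each key pair is unique in df2.
merged = df1.merge(
    df2,
    on=["Build_ID", "Group_ID"],
    how="left",
    validate="many_to_one",
)
print(merged)
```

With a left join, any df1 row that has no match in df2 would still be kept, with NaN in Group_Average. The default inner join would drop such a row.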