How to merge two DataFrames with duplicate values in pandas

Asked: 2015-12-02 18:12:48

Tags: python pandas dataframe

I have two DataFrames in pandas:

  dilevery_time   dispatch_time  source_lat  source_long  Address   name
0 21:39:37.265    21:47:37.265   -73.955741    40.3422     Dmart    John
0 21:39:37.265    21:47:37.265   -73.955741    40.3422     Dmart    John

The other one is:

  chef_name   dish_name   dish_price   dish_quantity   ratings
0   xyz        Chicken      120            1             4
1   abc        Paneer       100            2             3 

I want to join these two DataFrames in pandas. I tried a join, but it is not allowed because the first DataFrame has duplicate values.

So I did this instead:

pd.concat([df1, df2], join='inner', axis=1)

But this gives me the following output:

   dilevery_time  dispatch_time  source_long   Address  name  chef_name  
0  21:39:37.265   21:47:37.265    -73.955741    Dmart   John   xyz
0  21:39:37.265   21:47:37.265    -73.955741    Dmart   John   xyz

  dish_name   dish_price    dish_quantity    ratings
0  Chicken      120             1                4
0  Chicken      120             1                4

I want the result in this format:

   dilevery_time  dispatch_time  source_long   Address  name  chef_name  
0  21:39:37.265   21:47:37.265    -73.955741    Dmart   John   xyz
0  21:39:37.265   21:47:37.265    -73.955741    Dmart   John   abc

  dish_name   dish_price    dish_quantity    ratings
0  Chicken      120             1                4
0  Paneer       100             2                3

How can I do this in pandas?

1 Answer:

Answer 0 (score: 0)

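This happens because the index label 0 appears twice in the first DataFrame. You can call the reset_index method to replace the duplicated labels with 0, 1, ..., and then concat gives the result you want: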

In [9]: df
Out[9]: 
  chef_name dish_name  dish_price  dish_quantity  ratings
0       xyz   Chicken         120              1        4
1       abc    Paneer         100              2        3

In [10]: df1
Out[10]: 
  chef_name dish_name  dish_price  dish_quantity  ratings
0       xyz   Chicken         120              1        4
1       abc    Paneer         100              2        3

df1.reset_index(drop=True, inplace=True)

In [11]: pd.concat([df1, df2], join='inner', axis=1)
Out[11]: 
  chef_name dish_name  dish_price  dish_quantity  ratings dilevery_time  \
0       xyz   Chicken         120              1        4  21:39:37.265   
1       abc    Paneer         100              2        3  21:39:37.265   

  dispatch_time  source_lat  source_long Address  name  
0  21:47:37.265  -73.955741      40.3422   Dmart  John  
1  21:47:37.265  -73.955741      40.3422   Dmart  John
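
The steps above can be sketched as a self-contained script. This is only a minimal illustration: the variable names delivery and chefs are placeholders that do not appear in the original post, and the frames are rebuilt by hand from the values shown in the question. It uses the non-inplace form of reset_index, which returns a new DataFrame instead of modifying the original.

```python
import pandas as pd

# Delivery table from the question; the index label 0 is duplicated on purpose.
# ("delivery" and "chefs" are illustrative names, not from the original post.)
delivery = pd.DataFrame(
    {
        "dilevery_time": ["21:39:37.265", "21:39:37.265"],
        "dispatch_time": ["21:47:37.265", "21:47:37.265"],
        "source_lat": [-73.955741, -73.955741],
        "source_long": [40.3422, 40.3422],
        "Address": ["Dmart", "Dmart"],
        "name": ["John", "John"],
    },
    index=[0, 0],
)

# Chef table from the question, with the default index 0, 1.
chefs = pd.DataFrame(
    {
        "chef_name": ["xyz", "abc"],
        "dish_name": ["Chicken", "Paneer"],
        "dish_price": [120, 100],
        "dish_quantity": [1, 2],
        "ratings": [4, 3],
    }
)

# The duplicated label is what prevents a row-by-row alignment.
print(delivery.index.is_unique)

# Replace the duplicated labels with 0, 1, ... and discard the old index.
delivery_unique = delivery.reset_index(drop=True)

# With matching unique indexes, the rows are paired by position 0 and 1.
combined = pd.concat([delivery_unique, chefs], axis=1, join="inner")
print(combined)
```

Checking index.is_unique before concatenating along axis=1 is a quick way to spot this kind of problem in either frame.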