Pandas merge on a non-unique column produces unexpected results

Time: 2017-08-30 09:53:40

Tags: python pandas

I have two data frames that I want to join:

The first one (products) looks like this:

category,deleted,availability,approval,quality_issue,expiring,merchant_id,date,product_id
alimentation & boissons,-2.0,in stock,0,0.0,0.0,11061,08/30/2017,7997
alimentation & boissons,-2.0,in stock,approved,-2.0,0.0,11061,08/30/2017,65
alimentation & boissons,1.0,in stock,0,0.0,0.0,11061,08/30/2017,2186

Then I have a second one (accounts), which looks like this:

merchant_id, status
11060, 0
11061, Not performing automatic item updates for availability

Ideally, what I want to end up with is a data frame that contains all of the product rows, with status appended as the last column:

category,deleted,availability,approval,quality_issue,expiring,merchant_id,date,product_id, status
    alimentation & boissons,-2.0,in stock,0,0.0,0.0,11061,08/30/2017,7997, Not performing automatic item updates for availability
    alimentation & boissons,-2.0,in stock,approved,-2.0,0.0,11061,08/30/2017,65, Not performing automatic item updates for availability
    alimentation & boissons,1.0,in stock,0,0.0,0.0,11061,08/30/2017,2186, Not performing automatic item updates for availability

I am using this kind of merge:

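import numpy as np
import pandas as pd

# Outer merge on merchant_id, then replace missing values with 0
merged = pd.merge(account_statuses_agg,
                  account_statuses_df,
                  left_on='merchant_id',
                  right_on='merchant_id',
                  how='outer'
                  ).replace(np.nan, 0)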

However, the result I get is:

category,deleted,availability,approval,quality_issue,expiring,merchant_id,date,product_id, status
    alimentation & boissons,-2.0,in stock,0,0.0,0.0,11061,08/30/2017,7997, 0
    alimentation & boissons,-2.0,in stock,approved,-2.0,0.0,11061,08/30/2017,65, 0
    alimentation & boissons,1.0,in stock,0,0.0,0.0,11061,08/30/2017,2186, 0
    0,0,0,0,0,0,11061,0,0, Not performing automatic item updates for availability
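
A result like this, where rows carrying the same merchant_id end up on separate lines, usually means the key values do not compare equal even though they print the same. The following is only a minimal sketch to reproduce and diagnose that symptom. It assumes (this is not confirmed by the data shown above) that one frame stores merchant_id as a string with stray whitespace; the frame names and their contents are hypothetical stand-ins. The indicator=True argument of pd.merge adds a _merge column that says which side each row came from.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames. The leading whitespace in the
# accounts keys is an assumption made purely to reproduce the symptom.
products = pd.DataFrame({
    "merchant_id": ["11061", "11061", "11061"],
    "product_id": [7997, 65, 2186],
})
accounts = pd.DataFrame({
    "merchant_id": [" 11060", " 11061"],
    "status": ["0", "Not performing automatic item updates for availability"],
})

# indicator=True adds a "_merge" column: left_only, right_only or both.
diagnosed = pd.merge(products, accounts, on="merchant_id",
                     how="outer", indicator=True)
print(diagnosed["_merge"].value_counts())

# Count the rows that matched on both sides; here none do, so the outer
# merge simply stacks the unmatched rows from each frame.
matched = int((diagnosed["_merge"] == "both").sum())
print(matched)
```

If no row is marked both, comparing products["merchant_id"].dtype with accounts["merchant_id"].dtype, and looking at repr() of a few key values, would be a reasonable next check.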

What is happening in this merge? Does every key value have to be unique for a merge/join to work in pandas?
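
For what it is worth, pandas does not require the key to be unique on both sides: a many-to-one merge repeats the matching right-hand row for every left-hand row. The sketch below illustrates that under stated assumptions. The frames are cut-down, hypothetical versions of the ones above, and the dtype mismatch (integers on one side, strings on the other) is a guess at the cause, not something the question confirms. It aligns the key dtypes first, then uses how="left" so that only product rows are kept.

```python
import pandas as pd

# Hypothetical, cut-down versions of the two frames.
products = pd.DataFrame({
    "merchant_id": [11061, 11061, 11061],   # repeated (non-unique) keys
    "product_id": [7997, 65, 2186],
})
accounts = pd.DataFrame({
    "merchant_id": ["11060", "11061"],      # assumed: keys stored as strings
    "status": ["0", "Not performing automatic item updates for availability"],
})

# Bring both key columns to the same dtype before merging.
accounts["merchant_id"] = accounts["merchant_id"].str.strip().astype("int64")

# how="left" keeps every product row and nothing else. validate="m:1" only
# checks that the right-hand keys are unique; left-hand keys may repeat.
merged = products.merge(accounts, on="merchant_id", how="left", validate="m:1")
print(merged)
```

With how="outer" instead, the account 11060 row would also appear, with empty product columns, because it has no matching product.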

Thanks a lot!

0 answers:

No answers