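I am trying to merge a DataFrame into my master DataFrame. I have already merged several DataFrames successfully to build the master DataFrame, but one DataFrame causes a problem when I try to merge it.
Code used to build the current master frame: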
postransaction_df.PROD_NBR = postransaction_df.PROD_NBR.astype(float)
postprod_df = pd.merge(products_df, postransaction_df, on='PROD_NBR')
postcat_df = pd.merge(postprod_df, major_product_categories_df, on='MAJOR_CAT_CD')
Master frame:
postcat_df
Out[40]:
PROD_NBR PROD_DESC MAJOR_CAT_CD \
0 -7.358821e+10 VAL BABYS 1ST GENERAL 9687
1 -7.358821e+10 VAL BABYS 1ST GENERAL 9687
2 -7.204736e+10 CARD VAL ANYONE 9687
3 -7.204736e+10 CARD VAL ANYONE 9687
4 -7.204736e+10 CARD VAL ANYONE 9687
... ... ...
878509 8.940460e+10 ADVOCARE REDICODE PLUS DME STRIP 50CT 2343
878510 8.940460e+10 ADVOCARE REDICODE PLUS DME STRIP 50CT 2343
878511 8.940460e+10 ADVOCARE REDICODE PLUS DME STRIP 50CT 2343
878512 8.940460e+10 ADVOCARE REDICODE PLUS DME STRIP 50CT 2343
878513 8.940460e+10 ADVOCATE REDICODE TALKING GLUCOSE METER 2343
BSKT_ID PHRMCY_NBR SLS_DTE_NBR \
0 600010665100006106120160128 748613589991092598 20160128
1 600010665100006202720160208 748613589991092598 20160208
2 300000003998234235982 1174450154022548624 20160211
3 300000003787577235982 1174450154022548624 20160209
4 300000003792067235982 1174450154022548624 20160211
... ... ...
878509 600010687700002715520160312 1360787588063411417 20160312
878510 600010687700003139020160528 1360787588063411417 20160528
878511 600010687700002377820160111 1360787588063411417 20160111
878512 600010687700002814520160331 1360787588063411417 20160331
878513 600010687700002871320160412 1360787588063411417 20160412
EXT_SLS_AMT SLS_QTY MAJOR_CAT_DESC
0 1.25 1 GREETING CARDS
1 1.25 1 GREETING CARDS
2 1.99 1 GREETING CARDS
3 1.99 1 GREETING CARDS
4 1.99 1 GREETING CARDS
... ... ...
878509 24.00 2 DIABETES
878510 24.00 2 DIABETES
878511 12.00 1 DIABETES
878512 12.00 1 DIABETES
878513 10.00 1 DIABETES
The problematic frame:
pharmacy_df
Out[41]:
PHRMCY_NBR PHRMCY_NAM ST_CD
0 1.017330e+18 GNP PHARMACY #1 NJ
1 1.041420e+18 GNP PHARMACY #2 NJ
2 1.048830e+18 GNP PHARMACY #3 MA
3 1.057350e+18 GNP PHARMACY #4 NJ
4 1.058510e+18 GNP PHARMACY #5 NY
... ... ...
1092 9.471890e+17 GNP PHARMACY #1093 PA
1093 9.657430e+17 GNP PHARMACY #1094 PA
1094 9.671640e+16 GNP PHARMACY #1095 PA
1095 9.686930e+17 GNP PHARMACY #1096 PR
1096 9.741830e+17 GNP PHARMACY #1097 NJ
My code to merge the two frames:
pharmtotal_df = pd.merge(postcat_df, pharmacy_df, on='PHRMCY_NBR')
Result of that last merge:
pharmtotal_df
Out[43]:
Empty DataFrame
Columns: [PROD_NBR, PROD_DESC, MAJOR_CAT_CD, BSKT_ID, PHRMCY_NBR, SLS_DTE_NBR, EXT_SLS_AMT, SLS_QTY, MAJOR_CAT_DESC, PHRMCY_NAM, ST_CD]
Index: []
Does anyone know how to merge these without ending up with an empty DataFrame?
Any help is greatly appreciated.
Answer 0 (score: 0)
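The merge comes back empty either because none of the key values match between the two tables or because the key columns have different dtypes. Try casting the keys to a common dtype before merging: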
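# Cast the key on both sides to the same integer dtype before merging.
# "int64" is spelled out because plain int can be 32-bit on some platforms,
# which is too small for these product numbers.
postprod_df = pd.merge(products_df.assign(PROD_NBR=lambda d: d.PROD_NBR.astype("int64")),
                       postransaction_df.assign(PROD_NBR=lambda d: d.PROD_NBR.astype("int64")),
                       on='PROD_NBR')
postcat_df = pd.merge(postprod_df.assign(MAJOR_CAT_CD=lambda d: d.MAJOR_CAT_CD.astype("int64")),
                      major_product_categories_df.assign(MAJOR_CAT_CD=lambda d: d.MAJOR_CAT_CD.astype("int64")),
                      on='MAJOR_CAT_CD')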
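
The casts above only cover PROD_NBR and MAJOR_CAT_CD, while the merge that comes back empty is the one on PHRMCY_NBR. Judging from the output in the question, postcat_df seems to hold that column as integers while pharmacy_df shows it as float64. float64 cannot represent every 18- or 19-digit integer exactly, so the IDs may have been rounded before the merge. The sketch below is only an illustration under that assumption: the miniature frames, the rounded values, and the CSV text are all made up, since the original files are not shown.

```python
import io

import pandas as pd

# Hypothetical miniature frames; the values only mimic the question's output.
postcat_df = pd.DataFrame({
    "PHRMCY_NBR": [748613589991092598, 1174450154022548624],
    "EXT_SLS_AMT": [1.25, 1.99],
})
pharmacy_df = pd.DataFrame({
    # Assumption: the IDs were already rounded upstream and loaded as floats.
    "PHRMCY_NBR": [7.48614e17, 1.17445e18],
    "PHRMCY_NAM": ["GNP PHARMACY #1", "GNP PHARMACY #2"],
})

# 1. Compare the dtypes of the key column on both sides.
print(postcat_df["PHRMCY_NBR"].dtype, pharmacy_df["PHRMCY_NBR"].dtype)

# 2. Count the key values the two frames actually have in common.
left_keys = set(postcat_df["PHRMCY_NBR"])
right_keys = set(pharmacy_df["PHRMCY_NBR"].astype("int64"))
shared = left_keys & right_keys
print("shared keys:", len(shared))

# 3. float64 has a 53-bit significand, so an 18-digit ID generally
#    does not survive a round trip through float.
original = 748613589991092598
round_tripped = int(float(original))
print("survives float round trip:", original == round_tripped)

# 4. Re-read the pharmacy data with an exact integer dtype for the key.
#    The CSV text is a hypothetical stand-in for the real source file.
csv_text = (
    "PHRMCY_NBR,PHRMCY_NAM,ST_CD\n"
    "748613589991092598,GNP PHARMACY #1,NJ\n"
    "1174450154022548624,GNP PHARMACY #2,MA\n"
)
pharmacy_exact = pd.read_csv(io.StringIO(csv_text), dtype={"PHRMCY_NBR": "int64"})

# With exact keys of the same dtype on both sides, the merge finds its matches.
pharmtotal_df = pd.merge(postcat_df, pharmacy_exact, on="PHRMCY_NBR")
print(pharmtotal_df)
```

If the shared-key count is zero, casting the existing column will not help, because the exact digits are already gone; the IDs would need to be reloaded from the source without passing through float. Running pd.merge with how='outer' and indicator=True is another way to see which side each unmatched row comes from.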