我要合并收入df中的profile_ID列和CompProfile df中的索引CommodityClass上的两个数据框。
收入
profile_ID type col1 col2
0 O-COMP-1006 Small_Off 4.1427e+07 4.0027e+07
1 O-COMP-1006 Small_Off 4.7915e+07 4.6515e+07
2 O-COMP-1006 Small_Off 6.10424e+07 5.96424e+07
3 O-COMP-1006 Small_Off 6.83726e+07 6.69726e+07
4 O-COMP-1008 Small_Off 7.28167e+07 7.14167e+07
5 O-COMP-1009 Small_Off 7.6147e+07 7.4747e+07
7 O-COMP-1006 Small_Off 8.02798e+07 7.88798e+07
8 O-COMP-1006 Small_Off 8.17172e+07 8.03172e+07
9 O-COMP-1006 Small_Off 8.42322e+07 8.28322e+07
10 O-COMP-1005 Small_Off 8.54957e+07 8.40747e+07
11 O-COMP-1006 Small_Off 8.67782e+07 8.53358e+07
12 O-COMP-1006 Small_Off 8.80798e+07 8.66159e+07
13 O-COMP-1007 Small_Off 8.9401e+07 8.79151e+07
14 O-COMP-1006 Small_Off 9.0742e+07 8.92338e+07
和CompProfile
col1 col2 col3
CommodityClass
profile_ID NaN NaN NaN
O-COMP-1001 0.0 0.0 0.0
O-COMP-1002 0.0 0.0 0.0
O-COMP-1003 0.0 0.0 0.0
O-COMP-1004 0.0 0.0 0.0
O-COMP-1005 0.0 0.0 0.0
O-COMP-1006 1.0 0.0 0.0
O-COMP-1007 0.0 0.0 1.0
O-COMP-1008 0.0 0.0 0.0
O-COMP-1009 0.0 1.0 0.0
我用
pd.merge( Income, CompProfile, how='left', \
left_on = 'profile_ID', right_index=True, \
suffixes = ("_USD","_frac") )
并得到一个错误
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
我同时检查了要合并的列和索引,它们都是object类型的。 我尝试使用join,但是遇到了相同的错误。
Income.join(CompProfile, on= 'profile_ID',lsuffix = "_USD",rsuffix = "_frac")
我还尝试重置CompProfile的索引并合并到列上
CompProfile.reset_index()
pd.merge( Income, CompProfile, how='left', \
left_on = 'profile_ID', right_on='CommodityClass', \
suffixes = ("_USD","_frac") )
在这种情况下,我得到了
KeyError: 'CommodityClass'
我还尝试了从CompProfile中删除“ profile_ID”行,但没有任何改变。
CompProfile.head(10).to_dict()
{'col1': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 1.0, 'O-COMP-1007': 0.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 0.0}, 'col2': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 0.0, 'O-COMP-1007': 0.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 1.0}, 'col3': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 0.0, 'O-COMP-1007': 1.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 0.0}}
答案 0 :(得分:0)
您的第一次尝试效果很好。以下是一个完整的工作示例。您需要退后一步,弄清楚为什么您的数据与下面提供的示例数据完全不同。
import pandas as pd
from numpy import nan
d1 = {'profile_ID': {0: 'O-COMP-1006', 1: 'O-COMP-1006', 2: 'O-COMP-1006', 3: 'O-COMP-1006', 4: 'O-COMP-1008', 5: 'O-COMP-1009', 7: 'O-COMP-1006', 8: 'O-COMP-1006', 9: 'O-COMP-1006', 10: 'O-COMP-1005'}, 'type': {0: 'Small_Off', 1: 'Small_Off', 2: 'Small_Off', 3: 'Small_Off', 4: 'Small_Off', 5: 'Small_Off', 7: 'Small_Off', 8: 'Small_Off', 9: 'Small_Off', 10: 'Small_Off'}, 'col1': {0: 41427000.0, 1: 47915000.0, 2: 61042400.0, 3: 68372600.0, 4: 72816700.0, 5: 76147000.0, 7: 80279800.0, 8: 81717200.0, 9: 84232200.0, 10: 85495700.0}, 'col2': {0: 40027000.0, 1: 46515000.0, 2: 59642400.0, 3: 66972600.0, 4: 71416700.0, 5: 74747000.0, 7: 78879800.0, 8: 80317200.0, 9: 82832200.0, 10: 84074700.0}}
d2 = {'col1': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 1.0, 'O-COMP-1007': 0.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 0.0}, 'col2': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 0.0, 'O-COMP-1007': 0.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 1.0}, 'col3': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 0.0, 'O-COMP-1007': 1.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 0.0}}
Income = pd.DataFrame.from_dict(d1)
CompProfile = pd.DataFrame.from_dict(d2)
res = pd.merge(Income, CompProfile, how='left',
left_on='profile_ID', right_index=True,
suffixes=('_USD', '_frac'))
print(res)
profile_ID type col1_USD col2_USD col1_frac col2_frac col3
0 O-COMP-1006 Small_Off 41427000.0 40027000.0 1.0 0.0 0.0
1 O-COMP-1006 Small_Off 47915000.0 46515000.0 1.0 0.0 0.0
2 O-COMP-1006 Small_Off 61042400.0 59642400.0 1.0 0.0 0.0
3 O-COMP-1006 Small_Off 68372600.0 66972600.0 1.0 0.0 0.0
4 O-COMP-1008 Small_Off 72816700.0 71416700.0 0.0 0.0 0.0
5 O-COMP-1009 Small_Off 76147000.0 74747000.0 0.0 1.0 0.0
7 O-COMP-1006 Small_Off 80279800.0 78879800.0 1.0 0.0 0.0
8 O-COMP-1006 Small_Off 81717200.0 80317200.0 1.0 0.0 0.0
9 O-COMP-1006 Small_Off 84232200.0 82832200.0 1.0 0.0 0.0
10 O-COMP-1005 Small_Off 85495700.0 84074700.0 0.0 0.0 0.0