When I import my .txt file, I get the following error:
Can't convert 'int' object to str implicitly
I import the .txt file like this:
activityHeaders = ['Type', 'AccountID', 'ConID', 'SecurityID', 'Symbol', 'BBTicker', 'Currency', 'BaseCurrency', 'TradeDate', 'SettleDate', 'TransactionType', 'Quantity', 'UnitPrice', 'GrossAmount', 'SECFee', 'Commission', 'NetInBase', 'FXRatetoBase', 'Other1', 'Other2', 'Description']
dfActivity = pd.read_csv(activityFileUrl, skiprows=[1], header=activityHeaders, error_bad_lines=False)
My .txt file looks like this:
"H","I000000","Activity","20100407","16:02:38","20100329","1.0"
"D","I000000","","","","","CAD","EUR","20100329","20100329","WITH","0","0","-14.88","0","0","0","-14.88","-10.8158","",,"CASH TRANSFER (INTERNAL)"
"D","I000000","","","","","AUD","EUR","20100328","20100328","ADJ","0","0","4","0","0","0","4","2.7211","",,"CLIENT FEE (U000001, Commission)"
"D","U000001","37036548","DE000A0F6MD5","PRA","STK","EUR","EUR","20100329","20100331","SELL","-300","7.776","-2332.8","0","-6","0","2326.8","2326.8","405346125","FI","TRADE PRAKTIKER BAU-UND HEIMWERK A"
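I don't understand where the int could be coming from. Note that I am using skiprows and error_bad_lines to skip the first and last lines. I also tried setting header to None, and it returned the same error.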
Answer 0: (score: 2)
If you want the column names to come from the list activityHeaders, change header to names.
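from io import StringIO
# temp is a string holding the contents of the .txt file shown in the question
df = pd.read_csv(StringIO(temp), names=activityHeaders, skiprows=1, error_bad_lines=False)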
print (df)
Type AccountID ConID SecurityID Symbol BBTicker Currency \
D I000000 NaN NaN NaN NaN CAD EUR
D I000000 NaN NaN NaN NaN AUD EUR
D U000001 37036548.0 DE000A0F6MD5 PRA STK EUR EUR
BaseCurrency TradeDate SettleDate ... \
D 20100329 20100329 WITH ...
D 20100328 20100328 ADJ ...
D 20100329 20100331 SELL ...
Quantity UnitPrice GrossAmount SECFee Commission NetInBase \
D 0.000 -14.88 0 0 0 -14.88
D 0.000 4.00 0 0 0 4.00
D 7.776 -2332.80 0 -6 0 2326.80
FXRatetoBase Other1 Other2 Description
D -10.8158 NaN NaN CASH TRANSFER (INTERNAL)
D 2.7211 NaN NaN CLIENT FEE (U000001, Commission)
D 2326.8000 405346125.0 FI TRADE PRAKTIKER BAU-UND HEIMWERK A
[3 rows x 21 columns]
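As a self-contained illustration of the same fix, here is a minimal sketch that reads a shortened, hypothetical version of the file from memory. The four column names and the `sample` string are assumptions made for brevity; the point is only how `names=` and `skiprows=` interact.

```python
from io import StringIO

import pandas as pd

# Hypothetical, shortened stand-in for the real file: one "H" line, two "D" rows.
sample = (
    '"H","I000000","Activity","20100407"\n'
    '"D","I000000","CAD","CASH TRANSFER (INTERNAL)"\n'
    '"D","U000001","EUR","TRADE PRAKTIKER BAU-UND HEIMWERK A"\n'
)
columns = ["Type", "AccountID", "Currency", "Description"]

# names= supplies the column labels; skiprows=1 drops the first line of the file.
df = pd.read_csv(StringIO(sample), names=columns, skiprows=1)
print(df)
```

Passing the same list as header= instead fails, because header expects row numbers rather than labels.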
If you do not need to skip the first line, omit skiprows:
df = pd.read_csv(StringIO(temp), names=activityHeaders, error_bad_lines=False)
print (df)
Type AccountID ConID SecurityID Symbol BBTicker Currency \
0 H I000000 Activity 20100407 16:02:38 20100329 1.0
1 D I000000 NaN NaN NaN NaN CAD
2 D I000000 NaN NaN NaN NaN AUD
3 D U000001 37036548 DE000A0F6MD5 PRA STK EUR
BaseCurrency TradeDate SettleDate ... Quantity UnitPrice \
0 NaN NaN NaN ... NaN NaN
1 EUR 20100329.0 20100329.0 ... 0.0 0.000
2 EUR 20100328.0 20100328.0 ... 0.0 0.000
3 EUR 20100329.0 20100331.0 ... -300.0 7.776
GrossAmount SECFee Commission NetInBase FXRatetoBase Other1 \
0 NaN NaN NaN NaN NaN NaN
1 -14.88 0.0 0.0 0.0 -14.88 -10.8158
2 4.00 0.0 0.0 0.0 4.00 2.7211
3 -2332.80 0.0 -6.0 0.0 2326.80 2326.8000
Other2 Description
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 405346125.0 FI
[4 rows x 21 columns]
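Both calls above pass error_bad_lines=False. That argument was deprecated in pandas 1.3 and removed in pandas 2.0 in favour of on_bad_lines. The sketch below assumes pandas 1.3 or later and uses a small made-up file with its own header row (not the file from the question) to show a malformed line being skipped.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: the third line has one field too many.
sample = (
    "Type,AccountID,Currency\n"
    "D,I000000,CAD\n"
    "D,I000000,AUD,EXTRA\n"
    "D,U000001,EUR\n"
)

# on_bad_lines="skip" drops lines with too many fields instead of raising.
cleaned = pd.read_csv(StringIO(sample), on_bad_lines="skip")
print(cleaned)
```

On older pandas versions, error_bad_lines=False plays the same role.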
Answer 1: (score: 1)
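Remove the header argument entirely. header takes an integer or a list of integers that tells pandas which row(s) to use as the column names, not a list of labels.
Change skiprows to skiprows=[0], and you should be fine.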
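One thing worth checking with this approach: when header is left at its default, pandas takes the first line that remains after skipping as the column names. The sketch below uses a shortened, hypothetical sample (the `sample` string and its four fields are assumptions) to compare the default with header=None.

```python
from io import StringIO

import pandas as pd

# Hypothetical, shortened stand-in for the real file.
sample = (
    '"H","I000000","Activity","20100407"\n'
    '"D","I000000","CAD","CASH TRANSFER (INTERNAL)"\n'
    '"D","U000001","EUR","TRADE PRAKTIKER BAU-UND HEIMWERK A"\n'
)

# Default header: the first line left after skiprows=[0] becomes the column names.
inferred = pd.read_csv(StringIO(sample), skiprows=[0])

# header=None: no line is used for names, so both "D" rows stay in the data.
unlabeled = pd.read_csv(StringIO(sample), skiprows=[0], header=None)

print(inferred.columns.tolist())
print(unlabeled.columns.tolist())
```

If the file has no header line of its own, as in the question, header=None or names= keeps the first "D" row as data.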