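I have a CSV file with more than 100k records. In each record, one column holds comma-separated values, and I want to sort the values within that column.
Sample data: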
"PT3QB789TSUIDF371261","THE TORONTO,DOMINION BANK","HZSN7FQBPO5IEWYIGC72","MAS,CA.ON.OSC,ASIC*,AAAA","XVCCCCCCCCCCYYUUUUU"
"11111111111111111111","ABC,XYZ,QWE","HZSN7FQBPO5IEWYIGC72","POU,ABC,MAS,CA.QC.OSC,CA.ON.OSC","XVRRRRRRRRTTTTTTTTTTTTT"
"22222222222222222222","BHC,NBC,MKY","HZSN7FQBPO5IEWYIGC72","BVC,AZX,CA.SK.FCAA,CA.NL.DSS","QQQQQQQQQRRCGHDKLKSLS"
As you can see, columns 2 and 4 contain comma-separated values, but I only want to sort the values in column 4. So my output should look like this:
"PT3QB789TSUIDF371261","THE TORONTO,DOMINION BANK","HZSN7FQBPO5IEWYIGC72","AAAA,ASIC*,CA.ON.OSC,MAS","XVCCCCCCCCCCYYUUUUU"
"11111111111111111111","ABC,XYZ,QWE","HZSN7FQBPO5IEWYIGC72","ABC,CA.ON.OSC,CA.QC.OSC,MAS,POU","XVRRRRRRRRTTTTTTTTTTTTT"
"22222222222222222222","BHC,NBC,MKY","HZSN7FQBPO5IEWYIGC72","AZX,BVC,CA.NL.DSS,CA.SK.FCAA","QQQQQQQQQRRCGHDKLKSLS"
The code I am trying to write is as follows:
#!/usr/bin/python
import csv
OUT_FILE = '/proj/ctc/temp/sanjay/REC-754/2017-05-29_IR_Position_Report_US_US_2017-05-30_out.csv'
IN_FILE = '/proj/ctc/temp/sanjay/REC-754/2017-05-29_IR_Position_Report_US_US_2017-05-30.csv'
f = open(IN_FILE, 'r')
o = open(OUT_FILE,'w')
with f:
    reader = csv.reader(f)
with o:
    writer = csv.writer(o)
    for row in reader:
        reportable_jurisdiction=row[68]
        if ',' in reportable_jurisdiction:
            row[68]=sorted(list(row[68].split(',')))
            print " reportable Jurisdiction with comma "+reportable_jurisdiction
        else:
            print "reportable Jurisdiction if single "+reportable_jurisdiction
        if(f.closed):
            f=open(IN_FILE,"r")
        if(o.closed):
            o=open(OUT_FILE,"w")
        writer.writerow(row)
        print(row)
But when I run this Python script, I get the following error:
$./Csvreader2.py
Traceback (most recent call last):
  File "./Csvreader2.py", line 15, in <module>
    for row in reader:
ValueError: I/O operation on closed file
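I got past that problem; it was caused by the indentation of the code. But now I get a new problem:
Traceback (most recent call last):
  File "./Csvreader2.py", line 14, in <module>
    reportable_jurisdiction=row[68]
IndexError: list index out of range
My CSV file has more than 100 columns, so why does this error occur?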
Answer 0 (score: 1)
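You are using the files as context managers:
with f:
    # ...
with o:
    # ...
Once a with block ends (that is, as soon as the following code is indented at or below the level of the with statement), the file is closed and can no longer be read. This means the file f is already closed by the time the with o: line executes.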
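To see this behaviour in isolation, here is a minimal sketch. It assumes Python 3 and uses an in-memory io.StringIO object with made-up data as a stand-in for the real input file, so the names and contents are illustrative only:

```python
import csv
import io

# In-memory stand-in for the input file (hypothetical sample data).
f = io.StringIO('"a","x,y"\n"b","z"\n')

with f:
    # The reader only keeps a reference to f; it has not read anything yet.
    reader = csv.reader(f)

# The with block has ended, so f is closed at this point.
print(f.closed)

# Reading through the reader now fails, because it pulls lines from f lazily.
try:
    next(reader)
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

The csv.reader object is lazy: creating it inside the with block is not enough, because the actual reads happen later, when the loop iterates over it.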
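However, you need f to stay open so that you can keep reading from it through the csv.reader object. Put all of the code under a single with statement, which keeps both files open at the same time: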
with f, o:
    reader = csv.reader(f)
    writer = csv.writer(o)
    for row in reader:
        # ...
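Putting it together, a fuller version might look like the sketch below. This is only one possible way to write it, under several assumptions: it targets Python 3, the helper names sort_field, sort_rows and sort_csv_column are made up for illustration, and the sorted parts are joined back into a single string (the code in the question assigns the list itself to row[68], which would not produce the desired output). The length check is also an assumption about the IndexError: csv.reader returns an empty list for a blank line, so a blank or short line is one possible cause, but the question does not show enough to confirm it.

```python
import csv


def sort_field(value, sep=','):
    """Return value with its sep-separated parts sorted alphabetically."""
    return sep.join(sorted(value.split(sep)))


def sort_rows(rows, col):
    """Yield each row with column col sorted; rows that are too short pass through."""
    for row in rows:
        # Guard against blank or short rows (csv.reader yields [] for a blank line).
        if col < len(row):
            row[col] = sort_field(row[col])
        yield row


def sort_csv_column(in_path, out_path, col):
    """Copy in_path to out_path, sorting the comma-separated values in column col."""
    # A single with statement keeps both files open for the whole loop.
    with open(in_path, newline='') as f, open(out_path, 'w', newline='') as o:
        reader = csv.reader(f)
        # QUOTE_ALL keeps every field quoted, as in the sample data.
        writer = csv.writer(o, quoting=csv.QUOTE_ALL)
        writer.writerows(sort_rows(reader, col))
```

With the paths from the question, this would be called as sort_csv_column(IN_FILE, OUT_FILE, 68). Note that column indexes are zero-based, so 68 refers to the 69th column.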