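I'm trying to create a new version of a file that excludes NULL bytes. I'm attempting this with the code below, but it still breaks on the NULL bytes. How should I structure the for statement and the try/except block so that it keeps running after the exception?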
import csv
input_file = "/data/train.txt"
outFileName = "/data/train_no_null.txt"
############################
i_f = open( input_file, 'r' )
reader = csv.reader( i_f , delimiter = '|' )
outFile = open(outFileName, 'wb')
mywriter = csv.writer(outFile, delimiter = '|')
i_f.seek( 0 )
i = 1
for line in reader:
    try:
        i += 1
        mywriter.writerow(line)
    except csv.Error:
        print('csv choked on line %s' % (i + 1))
        pass
Edit:
Here is the error message:
Traceback (most recent call last):
  File "20150310_rewrite_csv_wo_NULL.py", line 26, in <module>
    for line in reader:
_csv.Error: line contains NULL byte
Update:
I'm now using this code:
i_f = open( input_file, 'r' )
reader = csv.reader( i_f , delimiter = '|' )
# reader.next()
outFile = open(outFileName, 'wb')
mywriter = csv.writer(outFile, delimiter = '|')
i_f.seek( 0 )
i = 1
for idx, line in enumerate(reader):
    try:
        mywriter.writerow(line)
    except:
        print('csv choked on line %s' % idx)
Now I get this error:
Traceback (most recent call last):
  File "20150310_rewrite_csv_wo_NULL.py", line 26, in <module>
    for idx, line in enumerate(reader):
_csv.Error: line contains NULL byte
Answer 0 (score: 0)
You can catch all errors with the following code...
for idx, line in enumerate(reader):
    try:
        mywriter.writerow(line)
    except:
        print('csv choked on line %s' % idx)
Answer 1 (score: 0)
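The exception is raised by the reader, in the for statement itself. That is outside the try/except, so it is never caught.
Even if it were caught, though, the reader will not want to continue once it has hit a NUL byte. If the reader never sees one, however, then with something like...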
for idx, line in enumerate(csv.reader((line.replace('\0','') for line in open('myfile.csv')), delimiter='|')):
...you will probably be fine.
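As a fuller illustration of the one-liner above, here is a minimal Python 3 sketch that strips NUL characters from each line before csv.reader sees them and then writes the rows back out. The helper names strip_nul and copy_without_nul and the sample data are made up for this example, and the question's code looks like Python 2, where the file modes would differ.

```python
import csv
import io


def strip_nul(lines):
    """Yield each line with NUL characters removed."""
    for line in lines:
        yield line.replace('\0', '')


def copy_without_nul(src, dst, delimiter='|'):
    """Copy delimited rows from src to dst, dropping NUL characters first.

    Returns the number of rows written.
    """
    reader = csv.reader(strip_nul(src), delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator='\n')
    count = 0
    for row in reader:
        writer.writerow(row)
        count += 1
    return count


# In-memory stand-ins for the input and output files (hypothetical data).
sample = io.StringIO('a|b\x00|c\n1|\x002|3\n')
result = io.StringIO()
rows_written = copy_without_nul(sample, result)
print(rows_written)
print(result.getvalue())
```

With real files, opening both with newline='' (as the csv module documentation recommends) and passing the file objects in place of the StringIO objects should behave the same way.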
Really, though, you should find out where the NUL bytes are coming from, because they may be a symptom of a wider problem with your data.
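One way to start that investigation is to read the file in binary mode and report which lines contain NUL bytes and how many. The sketch below is only an illustration; find_nul_lines is a hypothetical helper and the sample bytes are invented.

```python
import io


def find_nul_lines(binary_lines):
    """Return (line_number, nul_count) pairs for lines containing NUL bytes."""
    hits = []
    for number, raw in enumerate(binary_lines, start=1):
        count = raw.count(b'\x00')
        if count:
            hits.append((number, count))
    return hits


# In-memory stand-in for open(input_file, 'rb') (hypothetical data).
data = io.BytesIO(b'a|b|c\n1|\x002|3\n\x00\x00x|y|z\n')
report = find_nul_lines(data)
print(report)
```

If nearly every line shows a NUL after each character, the file might be UTF-16 rather than plain ASCII or UTF-8, in which case opening it with the right encoding would be a better fix than stripping bytes.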