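I am trying to open and read a CSV file that has no field names. Based on the research I have done, I am fairly sure it is UTF-8 encoded. My CSV has this format: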
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
I am using the following to open and read it:
def parseCSVCounter(csv_file):
    with codecs.open(csv_file, "r", "utf-8-sig", "strict", -1) as f:
        f = str(f)
        relayreader = csv.reader(f, delimiter=',')
        for row in relayreader:
            print(row)
            try:
                #row[0] = unicode(row[0], 'latin-1')
                counter(row)
                print('starting row..')
            except UnicodeDecodeError, e:
                print('something went wrong1')
                print e
            except Exception, e:
                print('something went wrong')
                print e
This produces:
Starting Command..
['<']
something went wrong
invalid literal for int() with base 10: '<'
['o']
something went wrong
invalid literal for int() with base 10: 'o'
........
starting row..
['9']
starting row..
['3']
starting row..
['8']
starting row..
['2']
starting row..
['8']
starting row..
['>']
something went wrong
invalid literal for int() with base 10: '>'
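I trimmed the output down to make my point. It looks as if field names are being generated for me automatically. With csv.DictReader(fieldnames='foo') I can specify the field names as a sequence. How do I make csv.reader() ignore the missing field names?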
Answer 0 (score: 3)
You don't need to call str(f); use the file object directly:
with codecs.open(csv_file, "r", "utf-8-sig", "strict") as f:
    relayreader = csv.reader(f, delimiter=',')
You are trying to read the output of str(f) as a CSV file, and that output is a string of the form:
<open file '/path/to/file', mode 'rb' at 0x105f10d20>
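You can see this in your error output: it spells out <, o, and so on, all the way through the digits of the memory address to the closing >.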
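To illustrate why a string produces one-character rows, here is a minimal sketch. It assumes Python 3 rather than the Python 2 used in the question, and the helper name read_rows is hypothetical. csv.reader accepts any iterable that yields lines; a string yields its characters one at a time, so each character is parsed as its own row.

```python
import csv

def read_rows(source):
    """Return every row csv.reader produces for the given iterable."""
    return list(csv.reader(source, delimiter=','))

# A string is an iterable of characters, so each character becomes a "line".
rows_from_string = read_rows("<ab>")
print(rows_from_string)  # [['<'], ['a'], ['b'], ['>']]

# An iterable of lines yields one row per line. csv.reader treats the first
# line as ordinary data; it never looks for or invents field names.
rows_from_lines = read_rows(["1,0,0,", "2,0,0,"])
print(rows_from_lines)   # [['1', '0', '0', ''], ['2', '0', '0', '']]
```

The empty last field in the second result comes from the trailing comma on each line, as in the sample data in the question.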
Note that the utf-8-sig codec handles a UTF-8 encoded BOM at the start of the file, but unless you expect that BOM to be there, the regular utf-8 codec will do fine.
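The difference between the two codecs can be sketched as follows. This assumes Python 3, where the built-in open() takes an encoding argument; the helper first_field and the temporary sample file are hypothetical and exist only for the demonstration.

```python
import csv
import os
import tempfile

def first_field(path, encoding):
    """Open the file with the given encoding and return the first CSV field."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return next(csv.reader(f, delimiter=","))[0]

# Write a small sample file that starts with a UTF-8 BOM (bytes EF BB BF).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbf1,0,0\n2,0,0\n")
    sample_path = tmp.name

with_sig = first_field(sample_path, "utf-8-sig")  # BOM is stripped
with_plain = first_field(sample_path, "utf-8")    # BOM stays in the data
os.remove(sample_path)

print(repr(with_sig))    # '1'
print(repr(with_plain))  # '\ufeff1'
```

If the file has no BOM, both encodings return the same text, so utf-8-sig is a safe choice when you are unsure whether one is present.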