I'm trying to write to a file, but I get the following error:
    Traceback (most recent call last):
      File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/merge-395780681.888.py", line 151, in <module>
        gc_all_d.writerow(row)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 148, in writerow
        return self.writer.writerow(self._dict_to_list(rowdict))
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u0329' in position 5: ordinal not in range(128)
The error occurs when I try to write a row from my counselor databases to a file that aggregates their names:
    # compile master spreadsheet
    with open('gc_all.txt_3', 'w') as gc_all:
        gc_all_d = csv.DictWriter(gc_all, fieldnames=fieldnames, extrasaction='ignore', delimiter='\t')
        gc_all_d.writeheader()
        for row in aicep_l:
            print row['name']
            gc_all_d.writerow(row)
        for row in nbcc_l:
            gc_all_d.writerow(row)
            print row['name']
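I'm in unfamiliar waters here. I don't see an argument to the writerow() method that would extend the encoding range to cover this character, '\u0329'.
I think the error may be related to the fact that I use the nameparser module to put all the counselors' names into the same format. HumanName, which I import from nameparser, may be writing out the counselors' names with a leading 'u' to denote unicode, so that the output is u'Sam the Man' rather than 'Sam the Man' and is not recognized.
Thanks for your help!
Error after revising the code based on the answers: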
      File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/merge-395782963.700.py", line 153, in <module>
        row['name'] = row['name'].encode('utf-8')
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 11: ordinal not in range(128)
The code that makes all the name entries uniform:
    # nbcc
    with open('/Users/samuelfinegold/Documents/noodle/gc/nbcc/nbcc_output.txt', 'rU') as nbcc:
        nbcc_d = csv.DictReader(nbcc, delimiter='\t')
        nbcc_l = []
        for row in nbcc_d:
            # name = HumanName(row['name'])
            # row['name'] = name.title + ' ' + name.first + ' ' + name.middle + ' ' + name.last + ' ' + name.suffix
            row['phone'] = row['phone'].translate(None, whitespace + punctuation)
            nbcc_l.append(row)
The revised code:
    # compile master spreadsheet
    with open('gc_all.txt_3', 'w') as gc_all:
        gc_all_d = csv.DictWriter(gc_all, fieldnames=fieldnames, extrasaction='ignore', delimiter='\t')
        gc_all_d.writeheader()
        for row in nbcc_l:
            row['name'] = row['name'].encode('utf-8')
            gc_all_d.writerow(row)
Error:
    Traceback (most recent call last):
      File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/merge-395784700.086.py", line 153, in <module>
        row['name'] = row['name'].encode('utf-8')
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 11: ordinal not in range(128)
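Answer 0 (score: 4)
From the docs:
This version of the csv module doesn't support Unicode input. Also, there are currently some issues regarding ASCII NUL characters. Accordingly, all input should be UTF-8 or printable ASCII to be safe; see the examples in section Examples.
You need to encode your data before writing it, for example: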
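    for row in aicep_l:
        print row['name']
        for key, value in row.iteritems():
            # Encode only unicode values; calling .encode() on a byte string
            # makes Python 2 decode it as ASCII first, which can fail.
            if isinstance(value, unicode):
                row[key] = value.encode('utf-8')
        gc_all_d.writerow(row)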
Or, since you're on 2.7, you can use a dict comprehension:
    for row in aicep_l:
        print row['name']
        row = {key: value.encode('utf-8') for key, value in row.iteritems()}
        gc_all_d.writerow(row)
Or use one of the more sophisticated patterns from the examples page of the docs.
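
One caveat, suggested by the UnicodeDecodeError in the question: in Python 2, calling .encode() on a value that is already a byte string first decodes it implicitly with the ASCII codec, so rows that mix unicode and pre-encoded values need a type check. The following is only a sketch under that assumption. encode_row is a hypothetical helper, not part of the csv module, and it assumes any byte strings in the row are already UTF-8. It is written to run on both Python 2 and 3.

```python
# Hypothetical helper (not part of the csv module): encode only text values
# and pass byte strings through, assuming they are already UTF-8.

text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def encode_row(row, encoding="utf-8"):
    """Return a copy of `row` with every text value encoded to bytes."""
    return {
        key: value.encode(encoding) if isinstance(value, text_type) else value
        for key, value in row.items()
    }


# A row mixing a unicode name, an already-encoded name, and a byte-string phone.
mixed = {
    "name": u"Sa\u0329m",
    "alias": u"Sa\u0329m".encode("utf-8"),
    "phone": b"5551234",
}
encoded = encode_row(mixed)
```

On Python 2, the loop body could then be reduced to gc_all_d.writerow(encode_row(row)).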
Answer 1 (score: 2)
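What you have is an output stream (your gc_all.txt_3 file, opened on the with line, with the stream instance in the variable gc_all) that Python thinks must contain only ASCII. You have asked it to write a Unicode string containing the Unicode character '\u0329'. For example: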
    >>> s = u"foo\u0329bar"
    >>> with open('/tmp/unicode.txt', 'w') as stream: stream.write(s)
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u0329' in position 3: ordinal not in range(128)
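You have a number of options, including doing an explicit .encode on each string. Alternatively, you can open the file with codecs.open, as described in http://docs.python.org/2/howto/unicode.html (I'm assuming Python 2.x; 3.x is a bit different):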
    >>> import codecs
    >>> with codecs.open('/tmp/unicode.txt', 'w', encoding='utf-8') as stream:
    ...     stream.write(s)
    ...
    >>>
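
As a variation on the same idea, io.open (available since Python 2.6, and the same function as the built-in open on Python 3) also accepts an encoding argument. The sketch below writes the string from the example to a temporary file and reads the raw bytes back. The temporary path is only there for illustration and is not part of the original example.

```python
import io
import os
import tempfile

s = u"foo\u0329bar"

# Create a throwaway path for the demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

try:
    # The stream encodes unicode text to UTF-8 as it is written.
    with io.open(path, "w", encoding="utf-8") as stream:
        stream.write(s)

    # Read the raw bytes back to see what actually landed on disk.
    with io.open(path, "rb") as stream:
        raw = stream.read()
finally:
    os.remove(path)
```

The combining character U+0329 ends up as the two bytes 0xCC 0xA9, which matches the 0xcc byte named in the question's UnicodeDecodeError.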
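Edited to add: per @Peter DeGlopper's answer, the explicit encode may be safer. UTF-8 output contains no NUL bytes (unless the text itself contains U+0000), so assuming you want UTF-8, which you usually do, it would probably be fine.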
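
The NUL point can be checked directly. This sketch compares UTF-8 with UTF-16 for the string used in the example. UTF-16 is included only as a contrast and is not something either answer proposes.

```python
s = u"foo\u0329bar"

utf8_bytes = s.encode("utf-8")
utf16_bytes = s.encode("utf-16-le")

# UTF-8 yields a NUL byte only if the text itself contains U+0000.
utf8_has_nul = b"\x00" in utf8_bytes

# UTF-16 stores each ASCII-range character with a zero high byte.
utf16_has_nul = b"\x00" in utf16_bytes
```

Here utf8_has_nul comes out False and utf16_has_nul comes out True, which is consistent with the docs' advice to feed the Python 2 csv module UTF-8 or printable ASCII.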