Question

I've run into the infamous UnicodeEncodeError. I researched this error extensively before posting here, and there appear to be several variants of it.

Here is my code:

import nltk,re,pprint
import geograpy
import codecs
from nameparser.parser import HumanName

text_file = codecs.open('H:/Study and Work/Marciano Asstship/Work/Incident_cards_data/2.Text Extracted/Boxes 1 - 21/Boxes 8 thru 21-TULE LAKE/Box9-TULE LAKE/box9.txt',encoding = 'utf-8')
text_data = text_file.read()

places = geograpy.get_place_context(text_data)
print places

Here is the error:

log.debug('%s on %s' % (e, url))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 619: ordinal not in range(128)

Answer 1

Either e or url is a Unicode string, which makes the formatted result passed to log.debug a Unicode string as well. For example:

>>> '%s %s' % ('abc','def')
'abc def'
>>> '%s %s' % (u'abc','def')
u'abc def'
>>> '%s %s' % ('abc',u'def')
u'abc def'
>>> '%s %s' % (u'abc',u'def')
u'abc def'

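log.debug probably does not expect a Unicode string, so Python 2 implicitly converts it to a byte string using the default ascii codec. The resulting Unicode string contains a character that ASCII cannot represent (u'\xa3', the pound sign), hence the UnicodeEncodeError.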
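
To illustrate the failure described above, here is a minimal sketch that reproduces the same error by calling encode('ascii') explicitly, then shows two ways to get a byte string without raising. The values assigned to e and url are hypothetical stand-ins, not the ones from the traceback, and whether encoding the message yourself is the right fix depends on how the library sets up its logging, which the question does not show.

```python
# Hypothetical stand-ins for the `e` and `url` values in the traceback.
e = u'price \xa3 5'           # Unicode string containing the pound sign
url = 'http://example.com'    # plain ASCII string

# In Python 2, formatting with any Unicode argument yields a Unicode result.
message = '%s on %s' % (e, url)

# Encoding with the ascii codec fails, just like the implicit conversion does.
try:
    message.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Option 1: encode explicitly with a codec that can represent the character.
utf8_bytes = message.encode('utf-8')

# Option 2: stay in ASCII but escape anything outside its range.
escaped_bytes = message.encode('ascii', 'backslashreplace')
```

The explicit UTF-8 encoding keeps the original character, while backslashreplace trades readability for output that is safe on ASCII-only streams.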

