ASCII codec error even after encoding as UTF-8

Asked: 2013-12-10 10:45:17

Tags: python google-app-engine encoding utf-8

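I am posting to GitHub's Markdown API, sending JSON data in a POST request. I found that I couldn't write lists because the list characters are not part of ASCII, and from what I looked up, the text should always be encoded. I encoded the text that needs to be marked down and the API call works, but when I try to make a list I still get the same error.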

The code for the POST method is:

import json
import urllib2


def markDown(to_mark):
    headers = {
        'content-type': 'application/json'
    }
    text = to_mark.decode('utf8')
    payload = {
        'text': text,
        'mode':'gfm'
    }
    data = json.dumps(payload)
    req = urllib2.Request('https://api.github.com/markdown', data, headers)
    response = urllib2.urlopen(req)
    marked_down = response.read()
    return marked_down

The error I get when I try to make a list is:

'ascii' codec can't decode byte 0xe2 in position 55: ordinal not in range(128)

Adding the full traceback:

Traceback (most recent call last):
    File "/home/bigb/Programming/google_appengine/google/appengine/runtime/wsgi.py", line 266, in Handle
      result = handler(dict(self._environ), self._StartResponse)
    File "/home/bigb/Programming/google_appengine/lib/webapp2-2.3/webapp2.py", line 1519, in __call__
      response = self._internal_error(e)
    File "/home/bigb/Programming/google_appengine/lib/webapp2-2.3/webapp2.py", line 1511, in __call__
      rv = self.handle_exception(request, response, e)
    File "/home/bigb/Programming/google_appengine/lib/webapp2-2.3/webapp2.py", line 1505, in __call__
      rv = self.router.dispatch(request, response)
    File "/home/bigb/Programming/google_appengine/lib/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
      return route.handler_adapter(request, response)
    File "/home/bigb/Programming/google_appengine/lib/webapp2-2.3/webapp2.py", line 1077, in __call__
      return handler.dispatch()
    File "/home/bigb/Programming/google_appengine/lib/webapp2-2.3/webapp2.py", line 547, in dispatch
      return self.handle_exception(e, self.app.debug)
    File "/home/bigb/Programming/google_appengine/lib/webapp2-2.3/webapp2.py", line 545, in dispatch
      return method(*args, **kwargs)
    File "/home/bigb/Programming/Blog/my-ramblings/blog.py", line 232, in post
      mark_blog = markDown(blog)
    File "/home/bigb/Programming/Blog/my-ramblings/blog.py", line 43, in markDown
      text = to_mark.decode('utf8')
    File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
      return codecs.utf_8_decode(input, errors, True)
  UnicodeEncodeError: 'ascii' codec can't encode characters in position 45-46: ordinal not in range(128)

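Am I misunderstanding something here? Thanks!

3 answers:

Answer 0 (score: 1)

Your to_mark value is not a Unicode value; you already have an encoded byte string. Calling encode() on a byte string tells Python 2 that it must first decode the value to Unicode before it can encode it again, and that implicit decode uses the default ASCII codec. That is what raises your exception: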

>>> '\xc3\xa5'.encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
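
To make that hidden step visible, here is a minimal sketch that spells out the implicit ASCII decode explicitly, so it behaves the same on Python 2 and Python 3. The variable names `raw` and `error` are just illustrative and are not from the question.

```python
# UTF-8 bytes for U+00E5, the same value used in the session above.
raw = b'\xc3\xa5'

error = None
try:
    # Roughly what Python 2 does behind the scenes for raw.encode('utf8'):
    # decode with the default ASCII codec first, then encode the result.
    raw.decode('ascii').encode('utf8')
except UnicodeDecodeError as exc:
    error = exc

# The failure comes from the decode step, before any UTF-8 encoding happens.
print(error)
```

The point of writing it this way is that the error names the 'ascii' codec even though the only codec mentioned in the original call was UTF-8.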

For json.dumps(), you want to pass Unicode values. If to_mark contains UTF-8 data, use str.decode():

text = to_mark.decode('utf8')
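
If you are not sure whether to_mark arrives as bytes or as Unicode, one option is to decode only when needed. This is a rough sketch, not the code from the question: `to_text` and `build_payload` are hypothetical helper names, and it assumes any byte input really is UTF-8.

```python
import json


def to_text(value, encoding='utf-8'):
    """Return value as Unicode text, decoding only when given bytes."""
    # On Python 2, bytes is an alias for str, so this also catches byte strings.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def build_payload(to_mark):
    """Build the JSON request body for the Markdown API call (hypothetical helper)."""
    payload = {'text': to_text(to_mark), 'mode': 'gfm'}
    # With the default ensure_ascii=True, non-ASCII characters are written
    # as \uXXXX escapes, so the resulting body is plain ASCII.
    return json.dumps(payload)


# A bullet character (U+2022) as UTF-8 bytes and as Unicode text.
body_from_bytes = build_payload(b'\xe2\x80\xa2 item')
body_from_text = build_payload(u'\u2022 item')
```

Both calls should produce the same body, which can then be passed as the data argument of the request.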

Answer 1 (score: 1)

Your code snippet shows:

text = to_mark.encode('utf-8')

But in the traceback you have:

File "/home/bigb/Programming/Blog/my-ramblings/blog.py", line 43, in markDown
    text = to_mark.decode('utf8')

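Please make sure to post the actual code and traceback, i.e. the code that actually raises the exception.

Answer 2 (score: 0)

I don't remember exactly, but when I ran into the very same error, I think I used decode/encode on the result of response.read(). For example: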
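
The difference matters because the two calls fail in opposite directions. The UnicodeEncodeError in the question's traceback is consistent with decode() being called on a value that is already Unicode: Python 2 first has to encode it to bytes with the ASCII codec. A small sketch of that implicit step, written out explicitly so it runs on Python 2 and 3 (the sample text is made up, not taken from the blog post):

```python
# Unicode text containing a bullet character (U+2022); sample value only.
text = u'\u2022 item'

error = None
try:
    # Roughly what Python 2 does for text.decode('utf8') on a unicode object:
    # encode with the default ASCII codec first, then decode the bytes.
    text.encode('ascii').decode('utf8')
except UnicodeEncodeError as exc:
    error = exc

# The failure is an *encode* error even though decode() was requested.
print(error)
```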

response.read().decode("utf8")
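
A rough sketch of what that looks like, without making a network call. The `io.BytesIO` object here is only a stand-in for the response returned by urllib2.urlopen(), and the HTML content is invented for illustration.

```python
import io

# Stand-in for the HTTP response object; an assumption for illustration only.
fake_response = io.BytesIO(b'<ul>\n<li>caf\xc3\xa9</li>\n</ul>')

# read() returns a byte string, just like the real response does.
raw_body = fake_response.read()

# Decode once, at the boundary, and work with Unicode text from here on.
marked_down = raw_body.decode('utf8')
```

This assumes the response body is UTF-8; if the server declares a different charset, that one should be used instead.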