DjangoUnicodeDecodeError: codec can't decode byte 0xdd in position

Asked: 2011-09-28 11:53:23

Tags: python django apache unicode mod-wsgi

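I am running a Django web application under Apache with mod_wsgi on a dedicated server running CentOS 5.5.

However, occasionally (six or seven times a day) it starts returning 500 errors for some random pages. If I refresh the page two or three times, it loads normally. But once the 500 errors start, they show up on every other page of my site as well.

After restarting Apache, everything is back to normal for 5 or 6 hours, but the errors always come back.

I have pasted the full error log below, but essentially it says DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0xdd in position 330: invalid continuation byte.

What could cause this kind of error, and how can I detect it? I can provide any additional information right away.

PS: I have the same setup (Apache with mod_wsgi) on my local PC running Win7, and I have never seen this error there.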

[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30] mod_wsgi (pid=30331): Exception occurred processing WSGI script '/var/www/html/MY_SITE/django.wsgi'., referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30] Traceback (most recent call last):, referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]   File "/opt/python2.7.1/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 273, in __call__, referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]     response = self.get_response(request), referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]   File "/opt/python2.7.1/lib/python2.7/site-packages/django/core/handlers/base.py", line 169, in get_response, referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]     response = self.handle_uncaught_exception(request, resolver, sys.exc_info()), referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]   File "/opt/python2.7.1/lib/python2.7/site-packages/django/core/handlers/base.py", line 203, in handle_uncaught_exception, referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]     return debug.technical_500_response(request, *exc_info), referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]   File "/opt/python2.7.1/lib/python2.7/site-packages/django/views/debug.py", line 59, in technical_500_response, referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]     html = reporter.get_traceback_html(), referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]   File "/opt/python2.7.1/lib/python2.7/site-packages/django/views/debug.py", line 117, in get_traceback_html, referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]     frame['vars'] = [(k, force_escape(pprint(v))) for k, v in frame['vars']], referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]   File "/opt/python2.7.1/lib/python2.7/site-packages/django/template/defaultfilters.py", line 34, in _dec, referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]     args[0] = force_unicode(args[0]), referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]   File "/opt/python2.7.1/lib/python2.7/site-packages/django/utils/encoding.py", line 93, in force_unicode, referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30]     raise DjangoUnicodeDecodeError(s, *e.args), referer: http://www.MY_SITE.com/
[Wed Sep 28 12:03:53 2011] [error] [client 46.104.250.30] DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0xdd in position 330: invalid continuation byte. You passed in "<WSGIRequest\\nGET:<QueryDict: {}>,\\nPOST:<QueryDict: {}>,\\nCOOKIES:{},\\nMETA:{'CSRF_COOKIE': '041ed0a93c4b355d4861a0662d49fcb4',\\n 'DOCUMENT_ROOT': '/var/www/html/MY_SITE',\\n 'GATEWAY_INTERFACE': 'CGI/1.1',\\n 'HTTP_ACCEPT': 'application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5',\\n 'HTTP_ACCEPT_ENCOD\\xddNG': 'gzip, deflate',\\n 'HTTP_ACCEPT_LANGUAGE': 'en-us',\\n 'HTTP_CACHE_CONTROL': 'max-age=0',\\n 'HTTP_CONNECT\\xddON': 'keep-alive',\\n 'HTTP_COOK\\xddE': 'csrftoken=10bc570d4ef77b17ce580106dafa9fb6; sessionid=60fb98634573194f7f5e18ef6014f59b',\\n 'HTTP_HOST': 'www.MY_SITE.com',\\n 'HTTP_REFERER': 'http://www.MY_SITE.com/',\\n 'HTTP_USER_AGENT': 'Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B367 Safari/531.21.10',\\n 'PATH_INFO': u'/main/faq/',\\n 'PATH_TRANSLATED': '/var/www/html/MY_SITE/django.wsgi/main/faq/',\\n 'QUERY_STRING': '',\\n 'REMOTE_ADDR': '46.104.250.30',\\n 'REMOTE_PORT': '49643',\\n 'REQUEST_METHOD': 'GET',\\n 'REQUEST_URI': '/main/faq/',\\n 'SCRIPT_FILENAME': '/var/www/html/MY_SITE/django.wsgi',\\n 'SCRIPT_NAME': u'',\\n 'SERVER_ADDR': '93.94.251.82',\\n 'SERVER_ADMIN': 'bilgi@MY_SITE.com',\\n 'SERVER_NAME': 'www.MY_SITE.com',\\n 'SERVER_PORT': '80',\\n 'SERVER_PROTOCOL': 'HTTP/1.1',\\n 'SERVER_SIGNATURE': '<address>Apache/2.2.3 (CentOS) Server at www.MY_SITE.com Port 80</address>\\\\n',\\n 'SERVER_SOFTWARE': 'Apache/2.2.3 (CentOS)',\\n 'mod_wsgi.application_group': 'MY_SITE.com|',\\n 'mod_wsgi.callable_object': 'application',\\n 'mod_wsgi.handler_script': '',\\n 'mod_wsgi.input_chunked': '0',\\n 'mod_wsgi.listener_host': '',\\n 'mod_wsgi.listener_port': '80',\\n 'mod_wsgi.process_group': '',\\n 'mod_wsgi.request_handler': 'wsgi-script',\\n 
'mod_wsgi.script_reloading': '1',\\n 'mod_wsgi.version': (3, 3),\\n 'wsgi.errors': <mod_wsgi.Log object at 0x2b7d75ddbfb0>,\\n 'wsgi.file_wrapper': <built-in method file_wrapper of mod_wsgi.Adapter object at 0x2b7d75f12a80>,\\n 'wsgi.input': <mod_wsgi.Input object at 0x2b7d75fa0a30>,\\n 'wsgi.multiprocess': True,\\n 'wsgi.multithread': False,\\n 'wsgi.run_once': False,\\n 'wsgi.url_scheme': 'http',\\n 'wsgi.version': (1, 1)}>" (<type 'str'>), referer: http://www.MY_SITE.com/

The site works flawlessly on my local machine (Win7, Apache, mod_wsgi), both under Apache and under Django's built-in development server.

2 Answers:

Answer 0 (score: 2)

In this request, the <WSGIRequest> built from what the client sent contains the client header 'HTTP_ACCEPT_ENCOD\\xddNG': 'gzip, deflate'

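If the actual encoding of the <WSGIRequest> is supposed to be UTF-8, then the server error is legitimate (unless this is not what the client actually sent). In UTF-8, the byte value 0xdd can only appear as the first byte of a two-byte character encoding, in which case the next byte must start with the bits 10. But the byte that follows 0xdd here starts with a 0 bit, so decoding it as UTF-8 fails.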
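
The byte-level argument can be checked directly. The following is a minimal sketch in Python 3 (the site in the question ran Python 2.7, so this is an illustration rather than the code path Django took). The helper names are made up for this example; the byte string is the header name copied from the log.

```python
# Header name as it appears in the log: 0xdd sits where the "I" of ENCODING should be.
raw_name = b"HTTP_ACCEPT_ENCOD\xddNG"


def is_two_byte_lead(byte):
    """True if the byte looks like 110xxxxx, the lead byte of a 2-byte UTF-8 sequence."""
    return byte >> 5 == 0b110


def is_continuation(byte):
    """True if the byte looks like 10xxxxxx, a UTF-8 continuation byte."""
    return byte >> 6 == 0b10


pos = raw_name.index(b"\xdd")
lead, following = raw_name[pos], raw_name[pos + 1]

print(is_two_byte_lead(lead))      # 0xdd is 0b11011101
print(is_continuation(following))  # "N" is 0x4e, 0b01001110

# Decoding the whole name as UTF-8 fails at the 0xdd byte.
try:
    raw_name.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc.reason, "at offset", exc.start)
```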

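If the actual encoding of the <WSGIRequest> is something else, then the server error may not be legitimate, because the client header could be interpreted as 'HTTP_ACCEPT_ENCODÝNG': 'gzip, deflate' (in the case of ISO-8859-1) and simply ignored.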
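
For comparison, here is a short Python 3 sketch of the ISO-8859-1 interpretation. It assumes nothing beyond the standard codecs; the variable names are illustrative only.

```python
raw_name = b"HTTP_ACCEPT_ENCOD\xddNG"

# ISO-8859-1 maps each byte 0x00-0xff to the code point with the same number,
# so 0xdd becomes U+00DD (the letter Ý) and decoding cannot fail.
decoded = raw_name.decode("iso-8859-1")
print(decoded)

# Every possible byte value survives a decode/encode round trip.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("iso-8859-1").encode("iso-8859-1")
print(round_trip == all_bytes)
```

The decoded name is not a header any application looks for, which is why it could be ignored instead of causing an exception.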

Try to identify the specific client that is sending these requests.
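
One possible way to do that is to scan each request's WSGI environ for header names that are not plain ASCII and log who sent them. The sketch below is only an illustration: `find_suspect_headers` and `describe_client` are hypothetical helpers, not part of Django or mod_wsgi, and the sample environ is trimmed down from the log above.

```python
def find_suspect_headers(environ):
    """Return the HTTP_* keys of a WSGI environ whose names contain non-ASCII characters."""
    suspects = []
    for key in environ:
        if not key.startswith("HTTP_"):
            continue
        if any(ord(ch) > 127 for ch in key):
            suspects.append(key)
    return sorted(suspects)


def describe_client(environ):
    """Build a short description of the sender, suitable for a log line."""
    return "%s %r" % (
        environ.get("REMOTE_ADDR", "?"),
        environ.get("HTTP_USER_AGENT", "?"),
    )


# Trimmed-down environ based on the values in the log.
sample_environ = {
    "HTTP_ACCEPT_ENCOD\xddNG": "gzip, deflate",
    "HTTP_CONNECT\xddON": "keep-alive",
    "HTTP_COOK\xddE": "csrftoken=...; sessionid=...",
    "HTTP_HOST": "www.MY_SITE.com",
    "HTTP_USER_AGENT": "Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us)",
    "REMOTE_ADDR": "46.104.250.30",
    "REQUEST_METHOD": "GET",
}

suspects = find_suspect_headers(sample_environ)
if suspects:
    print("suspect headers from %s: %r" % (describe_client(sample_environ), suspects))
```

Collecting these log lines over a day should show whether the bad requests share an address range or a user agent.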

Answer 1 (score: 1)

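You have two problems.

(1) Something between your server and this user is mangling the headers the user sends you, turning HTTP_CONNECTION into HTTP_CONNECTÝON. This kind of thing is usually done by the more ancient and brain-dead web proxies that misguided mobile internet providers sometimes use.

In this case it has even mangled your HTTP_COOKIE, which will certainly stop your application from working properly even if you fix the other problem:

(2) There is a bug in Django: it raises an exception when it tries to read a header name that contains a non-ASCII, non-UTF-8 byte sequence. HTTP explicitly defines headers as being expressed in ISO-8859-1, so Django should use that encoding, not UTF-8, to convert header names to Unicode. Every byte sequence is valid in ISO-8859-1, so there should never be a UnicodeDecodeError.

In practice, no header uses a non-ASCII name, and browser handling of non-ASCII header values is a bumpy and inconsistent affair. Even so, Django should accept bogus headers and ignore them.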
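
Until the framework tolerates such headers, one possible workaround is to drop them before the request reaches Django. The following is a minimal sketch of a WSGI middleware, written for Python 3; `DropBogusHeaders` and `demo_app` are hypothetical names invented for this example, not an existing Django or mod_wsgi API.

```python
class DropBogusHeaders(object):
    """WSGI middleware that removes HTTP_* entries whose names are not plain ASCII."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Copy the environ, skipping only HTTP_* keys with non-ASCII characters.
        cleaned = dict(
            (key, value)
            for key, value in environ.items()
            if not (key.startswith("HTTP_") and any(ord(ch) > 127 for ch in key))
        )
        return self.app(cleaned, start_response)


def demo_app(environ, start_response):
    """Stand-in application that reports which HTTP_* keys it received."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    names = sorted(key for key in environ if key.startswith("HTTP_"))
    return [", ".join(names).encode("ascii")]


wrapped = DropBogusHeaders(demo_app)
body = wrapped(
    {
        "HTTP_HOST": "www.example.com",
        "HTTP_CONNECT\xddON": "keep-alive",
        "REQUEST_METHOD": "GET",
    },
    lambda status, headers: None,
)
print(body)
```

In a real deployment the wrapper would go around the `application` callable in django.wsgi. It only avoids the exception: a request whose Cookie header name was mangled still arrives without its session, as described in point (1).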