lxml encoding error in production

Time: 2012-04-09 10:21:24

Tags: python google-app-engine lxml

I am trying to process some data with lxml. It works fine on my development server, but in production the following code:

from lxml import etree

parser = etree.XMLParser(encoding='cp1251')

raises:

  File "parser.pxi", line 1288, in lxml.etree.XMLParser.__init__ (third_party/apphosting/python/lxml/src/lxml/lxml.etree.c:77726)
  File "parser.pxi", line 738, in lxml.etree._BaseParser.__init__ (third_party/apphosting/python/lxml/src/lxml/lxml.etree.c:73404)
LookupError: unknown encoding: 'cp1251'

I am using lxml 2.3, and GAE appears to provide the same version. So why does this error occur?

EDIT

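I tried passing other encodings to XMLParser, such as cp1252, ISO-8859-5, and ISO-8859-2. Each one raises the same error on GAE but works on my local machine. These are popular encodings, and lxml on GAE ought to support them. I believe something is wrong with the lxml build on GAE.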
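The comparison described above can be scripted so that the same check runs locally and on GAE. This is only a minimal sketch: the helper names probe, probe_python, and probe_lxml are hypothetical and not part of lxml. It relies on the behavior shown in the traceback, namely that etree.XMLParser(encoding=...) raises LookupError for an encoding it does not recognize.

```python
import codecs


def probe(names, factory):
    """Return {name: bool}; False means factory(name) raised LookupError."""
    results = {}
    for name in names:
        try:
            factory(name)
            results[name] = True
        except LookupError:
            results[name] = False
    return results


def probe_python(names):
    # Python's own codec registry, which does not depend on lxml.
    return probe(names, codecs.lookup)


def probe_lxml(names):
    # Imported lazily so the rest of the module loads without lxml installed.
    from lxml import etree
    return probe(names, lambda name: etree.XMLParser(encoding=name))


# The encodings mentioned in the question, plus UTF-8 as a baseline.
ENCODINGS = ["cp1251", "cp1252", "ISO-8859-5", "ISO-8859-2", "utf-8"]
```

Logging probe_python(ENCODINGS) next to probe_lxml(ENCODINGS) in both environments would show whether only the lxml build on GAE rejects these names while Python's codecs still accept them.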

I filed an issue: http://code.google.com/p/googleappengine/issues/detail?id=7315

EDIT2

Full traceback:

unknown encoding: 'cp1251'
Traceback (most recent call last):
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~my_cool_app_id/1.358126884781269352/main.py", line 29, in get
    parser = etree.XMLParser(encoding='cp1251')
  File "parser.pxi", line 1288, in lxml.etree.XMLParser.__init__ (third_party/apphosting/python/lxml/src/lxml/lxml.etree.c:77726)
  File "parser.pxi", line 738, in lxml.etree._BaseParser.__init__ (third_party/apphosting/python/lxml/src/lxml/lxml.etree.c:73404)
LookupError: unknown encoding: 'cp1251'

1 answer:

Answer 0 (score: 1)

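There appears to be a bug report about this behavior on OS X, where specifying encoding="cp1252" causes the error above. The comments on that report indicate that other systems are affected as well: https://bugs.launchpad.net/lxml/+bug/707396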

Have you tried specifying other encodings? (To see whether the problem is specific to cp1252.)
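If the parser keeps rejecting legacy encodings, one possible workaround (not confirmed anywhere in this thread) is to decode the bytes with Python's own codecs and hand lxml UTF-8 instead. The sketch below assumes that Python's cp1251 codec is available in the affected environment and that the lxml build there still accepts encoding="utf-8". The function names to_utf8 and parse_legacy_xml are hypothetical. A document whose XML declaration names the legacy encoding may need extra handling; that case is untested here.

```python
import codecs


def to_utf8(raw, source_encoding):
    """Transcode raw bytes from source_encoding to UTF-8 using Python codecs."""
    # Fail early with LookupError if Python itself does not know the codec.
    codecs.lookup(source_encoding)
    return raw.decode(source_encoding).encode("utf-8")


def parse_legacy_xml(raw, source_encoding="cp1251"):
    """Parse XML bytes in a legacy encoding by feeding lxml UTF-8 instead."""
    # Imported lazily so to_utf8 stays usable without lxml installed.
    from lxml import etree
    # Assumption: UTF-8 is accepted even where cp1251 is rejected.
    parser = etree.XMLParser(encoding="utf-8")
    return etree.fromstring(to_utf8(raw, source_encoding), parser)
```

The point of the design is that the character-set conversion happens in Python's codec machinery, so the parser is never asked to look up cp1251 itself.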