lxml can't parse gzipped XML?

Date: 2012-10-15 19:36:34

Tags: python lxml

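I have this gzipped XML file: http://cdon.com/xml_files/cdon_games_SE.xml.gz

According to the lxml documentation (http://lxml.de/parsing.html), lxml can parse gzipped XML files: "lxml can parse from a local file, an HTTP URL or an FTP URL. It also auto-detects and reads gzip-compressed XML files (.gz)."

This code: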

import urllib  # Python 2 API
from lxml import etree

# Pass the HTTP response object straight to lxml
tree = urllib.urlopen('http://cdon.com/xml_files/cdon_games_SE.xml.gz')
parser = etree.XMLParser(recover=True)
tree = etree.parse(tree, parser)
tree = tree.xpath('//product')

gives this error:

    tree = tree.xpath('//product')
  File "lxml.etree.pyx", line 2038, in lxml.etree._ElementTree.xpath (src/lxml\lxml.etree.c:47529)
  File "lxml.etree.pyx", line 1709, in lxml.etree._ElementTree._assertHasRoot (src/lxml\lxml.etree.c:44508)
AssertionError: ElementTree not initialized, missing root

What is the problem? Can't lxml parse gzipped XML files? If I save the file as plain XML (without gzip) on a local server, it works fine.

1 answer:

Answer 0: (score: 0)

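The URL above returns the correct MIME type. Have you tried downloading the file and keeping it as .xml.gz, to see whether lxml behaves differently on a file than on a request handle?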
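Another way to narrow it down is to take lxml's gzip auto-detection out of the picture and decompress the bytes yourself before parsing. The following is only a sketch, not a confirmed fix: the helper name `parse_maybe_gzipped` and the sample data are made up for illustration, it is written for Python 3, and the real download is left commented out so the snippet runs without network access. The fallback import is there only so the sketch still runs where lxml is not installed.

```python
import gzip
import io

try:
    from lxml import etree
except ImportError:  # assumption: fall back to the stdlib API if lxml is missing
    import xml.etree.ElementTree as etree

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def parse_maybe_gzipped(raw_bytes):
    """Hypothetical helper: decompress if the data is gzip, then parse it."""
    if raw_bytes[:2] == GZIP_MAGIC:
        with gzip.GzipFile(fileobj=io.BytesIO(raw_bytes)) as fh:
            raw_bytes = fh.read()
    return etree.fromstring(raw_bytes)


# Local stand-in for the downloaded feed, so no network is needed.
sample_xml = b"<products><product>A</product><product>B</product></products>"
sample_gz = gzip.compress(sample_xml)

root = parse_maybe_gzipped(sample_gz)
products = root.findall(".//product")
print(len(products))

# With the real feed it would look roughly like this:
#   from urllib.request import urlopen
#   raw = urlopen("http://cdon.com/xml_files/cdon_games_SE.xml.gz").read()
#   root = parse_maybe_gzipped(raw)
```

If this works on the downloaded bytes while passing the response object to `etree.parse` does not, the difference is in how the handle is read rather than in the file itself.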