Question

我有一个看起来像这样的网址：

url = 'http://localhost:8080/?q=abc%26def&other_params=here'

在浏览器中访问此URL将返回xml。

我试图通过lxml解析该网址的响应：

tree = etree.parse(url)

这里的问题是etree编码百分比char，而url将是

url = 'http://localhost:8080/?q=abc%2526def&other_params=here'

如果我不编码我的q参数的值，那么整个网址都会搞砸：

url = 'http://localhost:8080/?q=abc&def&other_params=here'

在发出请求之前，有什么办法可以告诉lxml不要在该网址中包含字符吗？

Answer 1

我说这是lxml网址处理中的一个错误，您应该检查lxml tracker中的现有报告，并报告它是否存在。

目前的解决方法是使用urllib2来检索您的网址：

import urllib2

resp = urllib2.urlopen(url)
tree = etree.parse(resp)