Question

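I'm behind a corporate proxy and trying to use lxml. I can't find any reference to proxy support in lxml (of the kind urllib2 has). Can lxml connect through a proxy? Is there a workaround?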

Answer 1

So you need to download some XML through a proxy and then parse it with lxml, right?

First, download the XML page with the Python requests library, which has proxy support:

import requests

proxies = {
  "http": "http://10.10.1.10:3128",
  "https": "http://10.10.1.10:1080",
}

# Keep the response so its body can be handed to the parser afterwards.
response = requests.get("http://example.org", proxies=proxies)

More information on configuring proxies: http://docs.python-requests.org/en/latest/user/advanced/#proxies
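If several requests have to go through the same proxy, it may be more convenient to set the proxies once on a session. The following is only a sketch: the helper name make_proxied_session is made up for illustration, and the proxy addresses are the same placeholders used in the example above.

```python
import requests

# Placeholder proxy addresses (assumption); replace with your corporate proxy.
PROXIES = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}


def make_proxied_session(proxies):
    """Return a requests.Session whose requests all use the given proxies."""
    session = requests.Session()
    # Session.proxies is a plain dict; updating it applies to every later request.
    session.proxies.update(proxies)
    return session


session = make_proxied_session(PROXIES)
# session.get("http://example.org") would now be routed through the proxy.
```

requests also reads the HTTP_PROXY and HTTPS_PROXY environment variables by default, so setting those may be enough if you cannot change the code.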

Then parse it with lxml. Alternatively, use BeautifulSoup4, which may suit your needs better; it will use lxml as its parsing engine if lxml is installed. Usage example:

from bs4 import BeautifulSoup

html = "<body></body>"
x = BeautifulSoup(html, "xml")         # Note the xml as second argument.
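To tie the two steps together with lxml itself, something along these lines should work. It is a minimal sketch rather than a definitive recipe: the function names fetch_xml and parse_xml and the timeout value are assumptions made for illustration, and the sample document is invented so that the parsing step can be tried without network access.

```python
import requests
from lxml import etree


def parse_xml(content):
    """Parse raw XML bytes and return the root element."""
    return etree.fromstring(content)


def fetch_xml(url, proxies, timeout=10):
    """Download url through the given proxies and return the parsed root element."""
    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()  # Fail early on HTTP errors such as 407 or 502.
    # Pass bytes (response.content) rather than text so that lxml can honor
    # the encoding declared in the XML prolog.
    return parse_xml(response.content)


# Offline demonstration of the parsing step with an invented sample document.
sample = b"<?xml version='1.0' encoding='UTF-8'?><feed><entry>hello</entry></feed>"
root = parse_xml(sample)
print(root.tag, root.findtext("entry"))
```

With a real proxy, the call would look like fetch_xml("http://example.org", proxies), where proxies is the dictionary shown earlier.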
