This works for me:
import xml.etree.ElementTree as ET
from urllib2 import urlopen
url = 'http://example.com'
# this URL points to an XML page
tree = ET.parse(urlopen(url))
However, when I switch to requests, it breaks:
import requests
import xml.etree.ElementTree as ET
url = 'http://example.com'
# this URL points to an XML page
tree = ET.parse(requests.get(url))
The traceback is as follows:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
----> 1 tree = ET.parse(requests.get(url, proxies={'http': '192.168.235.36:7788'}))
/usr/lib/python2.7/xml/etree/ElementTree.py in parse(source, parser)
1180 def parse(source, parser=None):
1181 tree = ElementTree()
-> 1182 tree.parse(source, parser)
1183 return tree
1184
/usr/lib/python2.7/xml/etree/ElementTree.py in parse(self, source, parser)
645 close_source = False
646 if not hasattr(source, "read"):
--> 647 source = open(source, "rb")
648 close_source = True
649 try:
TypeError: coercing to Unicode: need string or buffer, Response found
So my question is: what goes wrong with requests in my case, and how can I make ET work with requests?
Answer 0 (score: 3)
You are passing the requests response object to ElementTree; you want to pass the raw file object instead:
r = requests.get(url, stream=True)
ET.parse(r.raw)
.raw returns the file-like socket object, which ElementTree.parse() reads from just as it would read from the urllib2 response (which is itself a file-like object).
A concrete example:
>>> r = requests.get('http://www.enetpulse.com/wp-content/uploads/sample_xml_feed_enetpulse_soccer.xml', stream=True)
>>> tree = ET.parse(r.raw)
>>> tree
<xml.etree.ElementTree.ElementTree object at 0x109dadc50>
>>> tree.getroot().tag
'spocosy'
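The difference between the two calls can be checked offline. The sketch below is only an illustration: FakeResponse and XML_BYTES are hypothetical stand-ins (not part of requests) for a response object that has no read() method but exposes a file-like .raw attribute.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical XML payload standing in for an HTTP response body.
XML_BYTES = b"<?xml version='1.0' encoding='UTF-8'?><feed><item id='1'/></feed>"


class FakeResponse(object):
    """Hypothetical stand-in for a response object: it has no read()
    method, but exposes a file-like object as .raw."""

    def __init__(self, body):
        self.raw = io.BytesIO(body)


resp = FakeResponse(XML_BYTES)

# ET.parse() looks for a read() method; without one it treats the
# argument as a file name and passes it to open().
has_read_on_response = hasattr(resp, "read")
has_read_on_raw = hasattr(resp.raw, "read")

# Passing the response object itself fails inside open().
try:
    ET.parse(FakeResponse(XML_BYTES))
    failed_with_type_error = False
except TypeError:
    failed_with_type_error = True

# Passing the file-like object works.
tree = ET.parse(resp.raw)
root_tag = tree.getroot().tag
```

This mirrors the `if not hasattr(source, "read")` check visible in the traceback from the question.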
If the URL serves a compressed response, the raw socket (as with urllib2) returns the compressed data undecoded; in that case you can use ET.fromstring() on the binary response content instead:
r = requests.get(url)
ET.fromstring(r.content)
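To see why the undecoded stream is a problem, here is a small offline sketch. It assumes gzip as the compression and simulates the response body with the standard library rather than making a real request; XML_BYTES and the variable names are made up for illustration.

```python
import gzip
import io
import xml.etree.ElementTree as ET

# Hypothetical XML payload standing in for the decoded response body.
XML_BYTES = b"<?xml version='1.0' encoding='UTF-8'?><feed><item id='1'/></feed>"

# Roughly what the raw socket would deliver for a gzip-encoded response.
compressed_body = gzip.compress(XML_BYTES)

# Parsing the still-compressed bytes fails, because they are not XML.
try:
    ET.parse(io.BytesIO(compressed_body))
    raw_parse_ok = True
except ET.ParseError:
    raw_parse_ok = False

# The decoded bytes (what r.content would hold) parse fine.
decoded_body = gzip.decompress(compressed_body)
root = ET.fromstring(decoded_body)
item_id = root.find("item").get("id")
```

If you would rather keep streaming, setting r.raw.decode_content = True before calling ET.parse(r.raw) is a commonly used way to have the underlying urllib3 stream decode the data; it is worth checking against the version you have installed.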
Answer 1 (score: 0)
You are not giving ElementTree the response text but the requests Response object itself, which is why you get the type error: need string or buffer, Response found. Do this instead:
r = requests.get(url)
# fromstring() returns the root Element, not an ElementTree
root = ET.fromstring(r.text)
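One thing to keep in mind: ET.fromstring() gives you the root Element rather than an ElementTree, so tree-level methods are not available on the result. A minimal sketch, using a hypothetical in-memory body in place of a real response:

```python
import xml.etree.ElementTree as ET

# Hypothetical body; with requests this would come from the response.
body = b"<?xml version='1.0' encoding='UTF-8'?><feed><item id='1'/></feed>"

root = ET.fromstring(body)    # an Element, not an ElementTree
tree = ET.ElementTree(root)   # wrap it if tree-level methods are needed

is_element = ET.iselement(root)
same_root = tree.getroot() is root
```

Wrapping the element in ET.ElementTree() gives you the same kind of object that ET.parse() returns.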