Question

Using the following code:

import xml.etree.cElementTree as ET
tree = ET.parse(r'https://apitest.batchbook.com/api/v1/people.xml?auth_token=GR5doLv88FrnLyLGIwok')

I get this error message:

IOError                                   Traceback (most recent call last)
<ipython-input-10-d91d452da3e7> in <module>()
----> 1 tree = ET.parse(r'https://apitest.batchbook.com/api/v1/people.xml?auth_token=GR5doLv88FrnLyLGIwok')

<string> in parse(source, parser)

<string> in parse(self, source, parser)

IOError: [Errno 22] invalid mode ('rb') or filename: 'https://apitest.batchbook.com/api/v1/people.xml?auth_token=GR5doLv88FrnLyLGIwok'

However, if I open the link above in a browser, save it to an XML file (people.xml), and then run:

tree = ET.parse(r'C:\Users\Eric\Downloads\people.xml')
tree.getroot()

I get the result: <Element 'people' at 0x00000000086AA420>

Any clues as to why using the link does not work? Thanks :)

Answer 1

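There is no file with that name anywhere on your filesystem. ET.parse treats a string argument as a local file path, so it tries to open the URL as a file (hence the 'rb' mode in the error). etree does not recognize the string as a web address, and even if it did, it could not fetch it.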

Instead, you should do something like this:

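# Python 2 only: urllib2 and the StringIO module were reorganized in Python 3.
import xml.etree.cElementTree as ET
import urllib2, StringIO

# Download the XML over HTTP(S).
page_with_xml = urllib2.urlopen(r'https://apitest.batchbook.com/api/v1/people.xml?auth_token=GR5doLv88FrnLyLGIwok')
# Copy the response body into an in-memory file-like object.
io_xml = StringIO.StringIO()
io_xml.write(page_with_xml.read())
# Rewind so that the parser reads from the beginning.
io_xml.seek(0)
tree = ET.parse(io_xml)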

Edited to account for the fact that etree.parse expects a file-like object. Not particularly elegant, but it gets the job done.
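For anyone on Python 3, where urllib2 no longer exists, the same idea can be sketched roughly as below. This is only a minimal sketch, not the original answer's code: the function name parse_xml_from_url and its opener parameter are hypothetical, added so the example can be run without network access, and the example.invalid URL is a placeholder. It relies on the response returned by urlopen being a file-like object that ET.parse can read directly.

```python
import io
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def parse_xml_from_url(url, opener=urlopen):
    """Fetch `url` and parse the response body as XML.

    `opener` is a hypothetical injection point (not part of any library API)
    so the function can be tried without a network; it defaults to urlopen.
    """
    # The object returned by the opener is file-like, so ET.parse can read
    # from it directly; no intermediate buffer is needed.
    with opener(url) as response:
        return ET.parse(response)


if __name__ == "__main__":
    # Stand-in for a real HTTP response: an in-memory bytes buffer.
    fake_opener = lambda url: io.BytesIO(b"<people><person>Eric</person></people>")
    tree = parse_xml_from_url("https://example.invalid/people.xml", opener=fake_opener)
    print(tree.getroot().tag)  # prints: people
```

With the default opener, calling parse_xml_from_url with the real URL should perform the download and parse in one step, assuming the server returns well-formed XML.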

Python ElementTree XML IOError: [Errno 22] invalid mode ('rb') or filename
