Question

I am trying to import an XML file downloaded from Scopus using the Python ElementTree library.

Here are two snippets from my actual code; both raise the same error:

1)

import urllib2
import xml.etree.ElementTree as ET

url = 'https://api.elsevier.com/content/search/scopus?query=au-id(' + author_id + ')&apiKey=' + apiKey_standard    
xml = urllib2.urlopen(url).read()
tree = ET.fromstring(xml)

2)

import urllib2
import xml.etree.ElementTree as ET

url = 'https://api.elsevier.com/content/search/scopus?query=au-id(' + author_id + ')&apiKey=' + apiKey_standard    
xml = urllib2.urlopen(url)
tree = ET.parse(xml)

Error:

xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0

If I run print xml[0] in snippet 1), I get {.

So it seems that the urllib2 read method is returning a JSON object rather than XML.
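
One way to confirm that suspicion is to look at the first non-whitespace byte of the response before choosing a parser. The following is only a minimal sketch in Python 3: the helper name parse_payload is hypothetical, and the check assumes a JSON body starts with { or [ while anything else is treated as XML.

```python
import json
import xml.etree.ElementTree as ET


def parse_payload(body):
    """Parse raw response bytes as JSON or XML, based on the first byte.

    Assumption: a JSON body starts with '{' or '['; anything else is
    handed to ElementTree as XML.
    """
    if body.lstrip()[:1] in (b'{', b'['):
        return 'json', json.loads(body.decode('utf-8'))
    return 'xml', ET.fromstring(body)


# Illustrative payloads only, not real Scopus responses.
kind, data = parse_payload(b'{"search-results": {}}')
print(kind)
kind, root = parse_payload(b'<feed><entry/></feed>')
print(kind, root.tag)
```

With a check like this, a JSON body is reported as such and never reaches ET.fromstring, where it would fail with the invalid token error at line 1, column 0.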

Answer 1

Add the &httpAccept=application%2Fatom%2Bxml parameter to your query to specify the format you want:

url = 'https://api.elsevier.com/content/search/scopus?query=au-id(' + author_id + ')&apiKey=' + apiKey_standard + '&httpAccept=application%2Fatom%2Bxml'
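
For Python 3, where urllib2 has been split into urllib.request and urllib.parse, the same fix might look like the sketch below. The helper names build_scopus_url and fetch_scopus_tree are hypothetical, and letting urlencode do the percent-escaping is a convenience rather than something the API requires.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = 'https://api.elsevier.com/content/search/scopus'


def build_scopus_url(author_id, api_key):
    """Build the search URL, asking for Atom XML via httpAccept."""
    params = {
        'query': 'au-id(' + author_id + ')',
        'apiKey': api_key,
        # urlencode escapes '/' and '+' to %2F and %2B.
        'httpAccept': 'application/atom+xml',
    }
    return BASE_URL + '?' + urlencode(params)


def fetch_scopus_tree(author_id, api_key):
    """Download the feed and return its root element (needs network access)."""
    url = build_scopus_url(author_id, api_key)
    with urllib.request.urlopen(url) as response:
        return ET.fromstring(response.read())


# Placeholder credentials, for illustration only.
print(build_scopus_url('1234567890', 'MY_API_KEY'))
```

Note that urlencode also escapes the parentheses in the query value, which a server decodes back to au-id(...) in the usual way.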
