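I'm very new to ElementTree, and I'm trying to parse an XML file for a TV add-on for XBMC. Below is the code I'm having trouble with. I think my XPath is incorrect and the placeholder isn't being applied to the attribute!
This is the XML file I'm using - http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930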
seasonnum = root2.findall("/Show/Episodelist/Season[@no='%s']/episode/seasonnum" % (season))
import xml.etree.ElementTree as ET
import urllib
tree2 = ET.parse(urllib.urlopen(url))
root2 = tree2.getroot()
seasonnum = tree2.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % '1')
print seasonnum
What I get is: SyntaxError: expected path separator ([)
Answer 0 (score: 2)
Using ElementTree:
>>> from xml.etree import ElementTree
>>> import urllib2
>>> url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
>>> request = urllib2.Request(url, headers={"Accept" : "application/xml"})
>>> u = urllib2.urlopen(request)
>>> tree = ElementTree.parse(u)
>>> rootElem = tree.getroot()
>>> [s.text for s in rootElem.findall('.//Season[@no="2"]/episode/seasonnum')]
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14',
'15', '16', '17', '18', '19', '20', '21', '22']
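Answer 1 (score: 1)
According to the xml.etree.ElementTree documentation - XPath support:
This module provides limited support for XPath expressions for locating elements in a tree. The goal is to support a small subset of the abbreviated syntax; a full XPath engine is outside the scope of the module.
You may need a third-party library such as lxml to use full XPath.
Example: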
>>> import lxml.etree
>>>
>>> url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
>>> tree = lxml.etree.parse(url)
>>> tree.xpath("/Show/Episodelist/Season[@no='%s']/episode/seasonnum/text()" % 1)
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
Update
To use xml.etree.ElementTree instead, the XPath needs a slight change: make it relative to the root element.
>>> import urllib
>>> import xml.etree.ElementTree as ET
>>>
>>> f = urllib.urlopen(url)
>>> tree = ET.parse(f)
>>> [e.text for e in tree.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % 1)]
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
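To try the relative path without hitting the network, here is a minimal sketch in Python 3 syntax. The inline XML is a made-up stand-in for the feed (its layout is assumed from the paths used above, not copied from the real response), and `season_numbers` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up sample that mirrors the assumed layout: Show/Episodelist/Season/episode
SAMPLE = """<Show>
  <name>Example</name>
  <Episodelist>
    <Season no="1">
      <episode><seasonnum>01</seasonnum></episode>
      <episode><seasonnum>02</seasonnum></episode>
    </Season>
    <Season no="2">
      <episode><seasonnum>01</seasonnum></episode>
    </Season>
  </Episodelist>
</Show>"""


def season_numbers(root, season):
    """Return the seasonnum texts for one season, using a path relative to root."""
    # root is already the <Show> element, so the path starts below it.
    path = "./Episodelist/Season[@no='%s']/episode/seasonnum" % season
    return [e.text for e in root.findall(path)]


root = ET.fromstring(SAMPLE)
print(season_numbers(root, 1))  # ['01', '02']
print(season_numbers(root, 2))  # ['01']
```

Because `root` is the `Show` element itself, the path starts with `./Episodelist` rather than `/Show/Episodelist`.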
Answer 2 (score: 0)
I tried your example, and it works for me. Here is a trimmed-down, complete version:
import urllib
import xml.etree.ElementTree as ET
url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
tree = ET.parse(urllib.urlopen(url))
seasons = tree.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % '1')
for s in seasons:
    print s.text
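The only problem I can think of is that you somehow downloaded a partial XML document. That is unlikely, but I don't know of any other explanation. Note that the script above is taken from your question; I only added the for loop.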
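One way to test the partial-download idea is to check whether the downloaded text parses at all, since a truncated document should fail when it is parsed rather than later inside `findall`. The following is only a sketch in Python 3 syntax; `is_complete_xml` is a hypothetical helper name, and the two strings are made-up inputs rather than real feed data.

```python
import xml.etree.ElementTree as ET


def is_complete_xml(content):
    """Return True if content parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(content)
    except ET.ParseError:
        # Raised for truncated or otherwise malformed documents.
        return False
    return True


print(is_complete_xml("<Show><Episodelist></Episodelist></Show>"))  # True
print(is_complete_xml("<Show><Episodelist>"))                       # False
```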
Answer 3 (score: 0)
import xml.etree.ElementTree as ET
import urllib
content = urllib.urlopen(url).read()
tree2 = ET.fromstring(content)
tvrage_seasons = tree2.findall('.//Season' )
For some reason, there must be a bug or something else in XBMC's ElementTree that keeps the attribute query from working. But this works well for me!
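Building on this, the season can be selected in plain Python instead of in the path, which avoids the `[@no='...']` predicate altogether. This is a minimal sketch in Python 3 syntax, not the exact code used in the add-on: the inline XML is a made-up stand-in for the feed, and `season_numbers_without_predicate` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up sample that mirrors the assumed layout of the feed.
SAMPLE = """<Show>
  <Episodelist>
    <Season no="1">
      <episode><seasonnum>01</seasonnum></episode>
      <episode><seasonnum>02</seasonnum></episode>
    </Season>
    <Season no="2">
      <episode><seasonnum>01</seasonnum></episode>
    </Season>
  </Episodelist>
</Show>"""


def season_numbers_without_predicate(root, season):
    """Collect seasonnum texts for one season by checking the attribute in Python."""
    wanted = str(season)
    result = []
    for season_elem in root.findall(".//Season"):
        # Compare the 'no' attribute by hand instead of using an XPath predicate.
        if season_elem.get("no") != wanted:
            continue
        for episode in season_elem.findall("episode"):
            num = episode.find("seasonnum")
            if num is not None:
                result.append(num.text)
    return result


root = ET.fromstring(SAMPLE)
print(season_numbers_without_predicate(root, 1))  # ['01', '02']
```

The attribute value is compared as a string, so passing `1` or `'1'` gives the same result.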