I'm trying to query an XML document to print the attributes of higher-level elements associated with lower-level elements. The results I'm getting don't match the XML structure. Basically, this is the code I have so far:
import xml.etree.ElementTree as ET

tree = ET.parse('movies2.xml')
root = tree.getroot()

for child in root:
    print(child.tag, child.attrib)
print()

mov = root.findall("./genre/decade/movie/[year='2000']")
for movie in mov:
    print(child.attrib['category'], movie.attrib['title'])
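This produces:
genre {'category': 'Action'}
genre {'category': 'Thriller'}
genre {'category': 'Comedy'}
Comedy X-Men
Comedy American Psycho
If you check the XML, the last two lines should actually list two different genre attributes associated with the movie titles:
Action X-Men
Thriller American Psycho
For reference, the XML (movies2.xml) nests each movie element inside a decade element inside a genre element, as the XPath in the code reflects.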
Answer 0 (score: 1)
Your initial loop:
for child in root:
    print(child.tag, child.attrib)
print()
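leaves child bound to the last child of root, so child.attrib['category'] will always be the category of that last child. In your case, the last child is the Comedy genre. For each movie in the second loop: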
for movie in mov:
    print(child.attrib['category'], movie.attrib['title'])
you are printing the category of the last child found in the first loop, so both lines print "Comedy".
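The underlying behavior is plain Python and can be seen without any XML. This is a minimal sketch in which a list of strings stands in for the genre elements; the list itself is an assumption made for illustration.

```python
# Hypothetical stand-in for the three genre elements under root.
categories = ['Action', 'Thriller', 'Comedy']

for child in categories:
    pass  # the loop body is irrelevant to the effect

# A for-loop variable is not scoped to the loop: after the loop ends,
# `child` still refers to the last item that was visited.
print(child)  # prints: Comedy
```

Any later code that reads child, such as the second loop in the question, therefore always sees that last item.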
Edit: this will at least select the same movies with the correct genre labels, although the order may differ:
for child in root:
    mov = child.findall("./decade/movie/[year='2000']")
    for movie in mov:
        print(child.attrib['category'], movie.attrib['title'])
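To try the per-genre search without the original file, here is a self-contained sketch. The inline XML is a hypothetical stand-in for movies2.xml: its structure is inferred from the XPath in the question, and the Comedy entry is a made-up placeholder. The predicate is written in the more common form movie[year='2000'], without the extra slash.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real movies2.xml is not shown in the question.
SAMPLE = """
<collection>
  <genre category="Action">
    <decade><movie title="X-Men"><year>2000</year></movie></decade>
  </genre>
  <genre category="Thriller">
    <decade><movie title="American Psycho"><year>2000</year></movie></decade>
  </genre>
  <genre category="Comedy">
    <decade><movie title="Example Comedy"><year>1986</year></movie></decade>
  </genre>
</collection>
"""

root = ET.fromstring(SAMPLE)

pairs = []
for genre in root:
    # Search only inside the current genre, so `genre` is the right ancestor.
    for movie in genre.findall("./decade/movie[year='2000']"):
        pairs.append((genre.attrib['category'], movie.attrib['title']))

for category, title in pairs:
    print(category, title)
```

Because the search starts from each genre element in turn, the category printed next to a title is always that of the genre containing the movie.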
Another approach, using lxml instead of ElementTree:
from lxml import etree as ET
tree = ET.parse('movies2.xml')
root = tree.getroot()
mov = root.findall("./genre/decade/movie/[year='2000']")
for movie in mov:
    print(movie.getparent().getparent().attrib['category'], movie.attrib['title'])
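If installing lxml is not an option, a similar lookup can be approximated with the standard library. Elements in xml.etree.ElementTree have no getparent(), so this sketch builds a child-to-parent dictionary first. This is a swapped-in technique, not part of the answer above, and the inline XML and the name parent_of are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; structure inferred from the XPath in the question.
SAMPLE = """
<collection>
  <genre category="Action">
    <decade><movie title="X-Men"><year>2000</year></movie></decade>
  </genre>
  <genre category="Thriller">
    <decade><movie title="American Psycho"><year>2000</year></movie></decade>
  </genre>
</collection>
"""

root = ET.fromstring(SAMPLE)

# Map every element to its parent by walking the whole tree once.
parent_of = {child: parent for parent in root.iter() for child in parent}

results = []
for movie in root.findall("./genre/decade/movie[year='2000']"):
    decade = parent_of[movie]   # movie -> decade
    genre = parent_of[decade]   # decade -> genre
    results.append((genre.attrib['category'], movie.attrib['title']))

for category, title in results:
    print(category, title)
```

The map has to be rebuilt if the tree is modified afterwards, which is one reason the lxml version may be more convenient.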