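I'm trying to get data from the XML file below. For each Type, the Explanation should appear right next to it.
For example:
Orange They belong to the Citrus. They cannot grow at a temperature below
Lemon They belong to the Citrus. They cannot grow at a temperature below
<Fruits>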
<Fruit>
<Family>Citrus</Family>
<Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
<Type>Orange</Type>
<Type>Lemon</Type>
<Type>Lime</Type>
<Type>Grapefruit</Type>
</Fruit>
<Fruit>
<Family>Pomes</Family>
<Type>Apple</Type>
<Type>Pear</Type>
</Fruit>
</Fruits>
This works with the code below. However, I have a problem with the second Fruit family, because it has no Explanation.
import os
from xml.etree import ElementTree

file_name = "example.xml"
full_file = os.path.abspath(os.path.join("xml", file_name))
dom = ElementTree.parse(full_file)
Fruit = dom.findall("Fruit")
for f in Fruit:
    Explanation = f.find("Explanation").text
    Types = f.findall("Type")
    for t in Types:
        Type = t.text
        print("{0}, {1}".format(Type, Explanation))
How can I skip elements like the Fruit family Pomes when the Explanation element is missing?
Answer 0 (score: 2)
Using xml.etree, just try to find the Explanation child:
from xml.etree import ElementTree as et

root = et.fromstring(xml)
for node in root.iter("Fruit"):
    if node.find("Explanation") is not None:
        print(node.find("Family").text)
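To connect this back to the original goal of printing each Type next to its Explanation, here is a minimal sketch using only the standard library. The helper name `type_explanation_pairs` is hypothetical (it is not part of the question or of xml.etree), and the XML is inlined as a string rather than read from a file so the example is self-contained.

```python
from xml.etree import ElementTree

xml = """<Fruits>
<Fruit>
<Family>Citrus</Family>
<Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
<Type>Orange</Type>
<Type>Lemon</Type>
<Type>Lime</Type>
<Type>Grapefruit</Type>
</Fruit>
<Fruit>
<Family>Pomes</Family>
<Type>Apple</Type>
<Type>Pear</Type>
</Fruit>
</Fruits>"""


def type_explanation_pairs(xml_text):
    """Return (type, explanation) tuples, skipping Fruit elements without an Explanation.

    Hypothetical helper written for illustration only.
    """
    root = ElementTree.fromstring(xml_text)
    pairs = []
    for fruit in root.iter("Fruit"):
        explanation = fruit.find("Explanation")
        if explanation is None:
            # find() returns None when the child is absent, so skip this Fruit
            continue
        for fruit_type in fruit.findall("Type"):
            pairs.append((fruit_type.text, explanation.text))
    return pairs


for type_name, explanation in type_explanation_pairs(xml):
    print("{0}, {1}".format(type_name, explanation))
```

Checking the result of `find()` against `None` before reading `.text` is what avoids the error on the Pomes entry.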
You can also use an XPath expression with lxml, which selects only the Fruit nodes that have an Explanation child:
import lxml.etree as et

root = et.fromstring(xml)
for node in root.xpath("//Fruit[Explanation]"):
    print(node.xpath("Family/text()"))
If we run it on your sample, you can see that we only get Citrus:
In [1]: xml = """<Fruits>
...: <Fruit>
...: <Family>Citrus</Family>
...: <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
...: <Type>Orange</Type>
...: <Type>Lemon</Type>
...: <Type>Lime</Type>
...: <Type>Grapefruit</Type>
...: </Fruit>
...: <Fruit>
...: <Family>Pomes</Family>
...: <Type>Apple</Type>
...: <Type>Pear</Type>
...: </Fruit>
...: </Fruits>"""
In [2]: import lxml.etree as et
In [3]: root = et.fromstring(xml)
In [4]: for node in root.xpath("//Fruit[Explanation]"):
...:     print(node.xpath("Family/text()"))
...:
['Citrus']
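If installing lxml is not an option, a similar filter can be written with the standard library, since xml.etree.ElementTree supports a limited XPath subset that includes the `[tag]` child predicate. The following is only a sketch under that assumption; it uses `findtext()` instead of `text()`, which the stdlib subset does not provide, and a shortened copy of the sample XML.

```python
from xml.etree import ElementTree

xml = """<Fruits>
<Fruit>
<Family>Citrus</Family>
<Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
<Type>Orange</Type>
</Fruit>
<Fruit>
<Family>Pomes</Family>
<Type>Apple</Type>
</Fruit>
</Fruits>"""

root = ElementTree.fromstring(xml)

# ".//Fruit[Explanation]" matches only Fruit elements that have an Explanation child
families = [fruit.findtext("Family") for fruit in root.findall(".//Fruit[Explanation]")]
print(families)
```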