Skip an XML tag if an attribute is missing

Time: 2016-08-21 20:33:03

Tags: python xml

I am trying to get data from the XML file below. For each Type, the Explanation should appear right next to it.

For example:

Orange They belong to the Citrus.They cannot grow at a temperature below
Lemon They belong to the Citrus.They cannot grow at a temperature below

<Fruits>
    <Fruit>
        <Family>Citrus</Family>
        <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
        <Type>Orange</Type>
        <Type>Lemon</Type>
        <Type>Lime</Type>
        <Type>Grapefruit</Type>
    </Fruit>
        <Fruit>
        <Family>Pomes</Family>
        <Type>Apple</Type>
        <Type>Pear</Type>        
    </Fruit>
</Fruits>

This works with the code below. However, I have a problem with the second Fruit Family, because it has no Explanation.

import os
from xml.etree import ElementTree
file_name = "example.xml"
full_file = os.path.abspath(os.path.join("xml", file_name))
dom = ElementTree.parse(full_file)
Fruit = dom.findall("Fruit")

for f in Fruit:
    Explanation = f.find("Explanation").text
    Types = f.findall("Type")
    for t in Types:
       Type = t.text
       print ("{0}, {1}".format(Type, Explanation))

How can I skip a tag like the Fruit Family (Pomes) when its Explanation is missing?

1 Answer:

Answer 0: (Score: 2)

With xml.etree, just try to find the Explanation child; find returns None when the child is missing:

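from xml.etree import ElementTree as et

# xml is the sample document as a string (see the session further down)
root = et.fromstring(xml)

for node in root.iter("Fruit"):
    # find() returns None when there is no Explanation child, so Pomes is skipped
    if node.find("Explanation") is not None:
        print(node.find("Family").text)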
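
The same check can be extended to produce the "Type, Explanation" lines asked for in the question. The following is only a sketch under a few assumptions: the helper name types_with_explanation is made up for illustration, and the XML is the sample from the question pasted in as a string rather than read from a file.

```python
from xml.etree import ElementTree as et

# Sample document copied from the question.
xml = """<Fruits>
    <Fruit>
        <Family>Citrus</Family>
        <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
        <Type>Orange</Type>
        <Type>Lemon</Type>
        <Type>Lime</Type>
        <Type>Grapefruit</Type>
    </Fruit>
    <Fruit>
        <Family>Pomes</Family>
        <Type>Apple</Type>
        <Type>Pear</Type>
    </Fruit>
</Fruits>"""


def types_with_explanation(root):
    """Hypothetical helper: return (type, explanation) pairs.

    Fruit nodes without an Explanation child are skipped entirely.
    """
    pairs = []
    for fruit in root.iter("Fruit"):
        explanation = fruit.find("Explanation")
        if explanation is None:
            # No Explanation child (e.g. Pomes): skip this Fruit.
            continue
        for t in fruit.findall("Type"):
            pairs.append((t.text, explanation.text))
    return pairs


root = et.fromstring(xml)
pairs = types_with_explanation(root)
for type_name, explanation in pairs:
    print("{0}, {1}".format(type_name, explanation))
```

Looking up the Explanation once per Fruit and testing it against None avoids the AttributeError that f.find("Explanation").text raises in the question's loop when the child is absent.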

You can also use xpath with lxml to select only the Fruit nodes that have an Explanation child:

import lxml.etree as et

root = et.fromstring(xml)

for node in root.xpath("//Fruit[Explanation]"):
    print(node.xpath("Family/text()"))

If we run it on your sample, you can see that we only get Citrus:

In [1]: xml = """<Fruits>
   ...:     <Fruit>
   ...:         <Family>Citrus</Family>
   ...:         <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
   ...:         <Type>Orange</Type>
   ...:         <Type>Lemon</Type>
   ...:         <Type>Lime</Type>
   ...:         <Type>Grapefruit</Type>
   ...:     </Fruit>
   ...:         <Fruit>
   ...:         <Family>Pomes</Family>
   ...:         <Type>Apple</Type>
   ...:         <Type>Pear</Type>
   ...:     </Fruit>
   ...: </Fruits>"""


In [2]: import lxml.etree as et

In [3]: root = et.fromstring(xml)

In [4]: for node in root.xpath("//Fruit[Explanation]"):
   ...:         print(node.xpath("Family/text()"))
   ...:     
['Citrus']
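
If installing lxml is not an option, the limited XPath subset in the standard library's xml.etree.ElementTree also accepts a [tag] predicate on immediate children, so a similar filter can be written without it. A minimal sketch, assuming the same sample document as a string; the variable name families is only illustrative:

```python
from xml.etree import ElementTree as et

# Sample document copied from the question.
xml = """<Fruits>
    <Fruit>
        <Family>Citrus</Family>
        <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
        <Type>Orange</Type>
        <Type>Lemon</Type>
        <Type>Lime</Type>
        <Type>Grapefruit</Type>
    </Fruit>
    <Fruit>
        <Family>Pomes</Family>
        <Type>Apple</Type>
        <Type>Pear</Type>
    </Fruit>
</Fruits>"""

root = et.fromstring(xml)

# "Fruit[Explanation]" matches Fruit children of the root element
# that have at least one immediate child named Explanation.
families = [node.findtext("Family") for node in root.findall("Fruit[Explanation]")]
print(families)
```

Unlike the lxml call, findtext returns a single string per node rather than a list of text nodes.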