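Removing a container from XML

Time: 2018-11-27 18:18:02

Tags: python xml elementtree

This is my input file. I deliberately left out the headers because I don't think they are relevant to the problem. I haven't pasted the whole file because it is large; I've included only two containers: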

  <ECUC-CONTAINER-VALUE>
     <SHORT-NAME>ABC</SHORT-NAME>
     <DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject</DEFINITION-REF>
     <PARAMETER-VALUES>
       <ECUC-NUMERICAL-PARAM-VALUES>
         <DEFINITION-REF DEST="ECUC-INTEGER-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue</DEFINITION-REF>
          <VALUE>1053</VALUE>
       </ECUC-NUMERICAL-PARAM-VALUES>
       <ECUC-TEXTUAL-PARAM-VALUES>
       <DEFINITION-REF DEST="ECUC-ENUMERATION-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANHandleType</DEFINITION-REF>
           <VALUE>TRUE</VALUE>
        </ECUC-TEXTUAL-PARAM-VALUES>
      </PARAMETER-VALUES>        
<ECUC-CONTAINER-VALUE>

    <ECUC-CONTAINER-VALUE>
     <SHORT-NAME>ABC</SHORT-NAME>
     <DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject</DEFINITION-REF>
     <PARAMETER-VALUES>
       <ECUC-NUMERICAL-PARAM-VALUES>
         <DEFINITION-REF DEST="ECUC-INTEGER-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue</DEFINITION-REF>
          <VALUE>1054</VALUE>
       </ECUC-NUMERICAL-PARAM-VALUES>
       <ECUC-TEXTUAL-PARAM-VALUES>
       <DEFINITION-REF DEST="ECUC-ENUMERATION-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANHandleType</DEFINITION-REF>
           <VALUE>FALSE</VALUE>
        </ECUC-TEXTUAL-PARAM-VALUES>
 </PARAMETER-VALUES>        
<ECUC-CONTAINER-VALUE>

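My XML has about 100 <ECUC-CONTAINER-VALUE> tags. I have to delete the <ECUC-NUMERICAL-PARAM-VALUES> container if the text of the <DEFINITION-REF DEST="ECUC-INTEGER-PARAM-DEF"> element under it is /AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue, but my script does not produce that result. Please help.

The script I wrote: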

import xml.etree.ElementTree
tree = ET.parse('a.xml')
root = tree.getroot()

for child in root.findall(".//ECUC-NUMERICAL-PARAM-VALUE"):
    for gchild in child.findall(".//DEFINITION-REF [@DEST='ECUC-INTEGER-PARAM-DEF']"):
         string = gchild.find("VALUE").text
         if string == "/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue":
             root.remove(child)

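1 Answer:

Answer 0 (score: 1)

If you want to remove ECUC-NUMERICAL-PARAM-VALUES, you need to select its parent, because an element can only be removed through its direct parent. So try iterating from the PARAMETER-VALUES level.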
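To illustrate why the parent matters: in ElementTree, `Element.remove()` only considers direct children, so asking the root to remove a deeply nested element raises `ValueError`. The following is a minimal sketch on a tiny made-up document (not the asker's real file); the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in document (hypothetical, much smaller than the real input).
root = ET.fromstring(
    "<doc><PARAMETER-VALUES>"
    "<ECUC-NUMERICAL-PARAM-VALUES><VALUE>1053</VALUE></ECUC-NUMERICAL-PARAM-VALUES>"
    "</PARAMETER-VALUES></doc>"
)

parent = root.find("PARAMETER-VALUES")
target = parent.find("ECUC-NUMERICAL-PARAM-VALUES")

# remove() only looks at direct children, so the root cannot remove a grandchild.
try:
    root.remove(target)
    removed_from_root = True
except ValueError:
    removed_from_root = False

# Removing through the element's actual parent works.
parent.remove(target)
remaining = len(parent)  # number of children left under PARAMETER-VALUES
```

This is probably one of the reasons the script in the question gave no result; the others appear to be the missing `as ET` alias on the import, the tag name ECUC-NUMERICAL-PARAM-VALUE (without the trailing S), and reading the path from VALUE instead of from the DEFINITION-REF text.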

Example...

XML input (a.xml; updated to be well-formed)

<doc>
    <ECUC-CONTAINER-VALUE>
        <SHORT-NAME>ABC</SHORT-NAME>
        <DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject</DEFINITION-REF>
        <PARAMETER-VALUES>
            <ECUC-NUMERICAL-PARAM-VALUES>
                <DEFINITION-REF DEST="ECUC-INTEGER-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue</DEFINITION-REF>
                <VALUE>1053</VALUE>
            </ECUC-NUMERICAL-PARAM-VALUES>
            <ECUC-TEXTUAL-PARAM-VALUES>
                <DEFINITION-REF DEST="ECUC-ENUMERATION-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANHandleType</DEFINITION-REF>
                <VALUE>TRUE</VALUE>
            </ECUC-TEXTUAL-PARAM-VALUES>
        </PARAMETER-VALUES>
    </ECUC-CONTAINER-VALUE>
    <ECUC-CONTAINER-VALUE>
        <SHORT-NAME>ABC</SHORT-NAME>
        <DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject</DEFINITION-REF>
        <PARAMETER-VALUES>
            <ECUC-NUMERICAL-PARAM-VALUES>
                <DEFINITION-REF DEST="ECUC-INTEGER-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue</DEFINITION-REF>
                <VALUE>1054</VALUE>
            </ECUC-NUMERICAL-PARAM-VALUES>
            <ECUC-TEXTUAL-PARAM-VALUES>
                <DEFINITION-REF DEST="ECUC-ENUMERATION-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANHandleType</DEFINITION-REF>
                <VALUE>FALSE</VALUE>
            </ECUC-TEXTUAL-PARAM-VALUES>
        </PARAMETER-VALUES>
    </ECUC-CONTAINER-VALUE>
</doc>

Python

import xml.etree.ElementTree as ET

tree = ET.parse('a.xml')

# Iterate from the parent level so that each match can be removed from its parent.
for p_vals in tree.findall(".//PARAMETER-VALUES"):
    # Search direct children only: remove() cannot delete deeper descendants.
    for num_p_vals in p_vals.findall("ECUC-NUMERICAL-PARAM-VALUES"):
        def_ref = num_p_vals.find("DEFINITION-REF[@DEST='ECUC-INTEGER-PARAM-DEF']")
        if def_ref is not None and def_ref.text == \
                "/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue":
            p_vals.remove(num_p_vals)

ET.dump(tree)

Output

<doc>
    <ECUC-CONTAINER-VALUE>
        <SHORT-NAME>ABC</SHORT-NAME>
        <DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject</DEFINITION-REF>
        <PARAMETER-VALUES>
            <ECUC-TEXTUAL-PARAM-VALUES>
                <DEFINITION-REF DEST="ECUC-ENUMERATION-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANHandleType</DEFINITION-REF>
                <VALUE>TRUE</VALUE>
            </ECUC-TEXTUAL-PARAM-VALUES>
        </PARAMETER-VALUES>
    </ECUC-CONTAINER-VALUE>
    <ECUC-CONTAINER-VALUE>
        <SHORT-NAME>ABC</SHORT-NAME>
        <DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject</DEFINITION-REF>
        <PARAMETER-VALUES>
            <ECUC-TEXTUAL-PARAM-VALUES>
                <DEFINITION-REF DEST="ECUC-ENUMERATION-PARAM-DEF">/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANHandleType</DEFINITION-REF>
                <VALUE>FALSE</VALUE>
            </ECUC-TEXTUAL-PARAM-VALUES>
        </PARAMETER-VALUES>
    </ECUC-CONTAINER-VALUE>
</doc>

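If you are able to use lxml, it has better XPath support than ElementTree. You can also use getparent() to access the parent element. In my opinion, this simplifies the answer.

Example... (same input as above produces the same output as above)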

from lxml import etree

tree = etree.parse('a.xml')

for num_p_vals in tree.xpath(".//ECUC-NUMERICAL-PARAM-VALUES[DEFINITION-REF[@DEST='ECUC-INTEGER-PARAM-DEF']='/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue']"):
    num_p_vals.getparent().remove(num_p_vals)

etree.dump(tree.getroot())
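
If lxml cannot be installed, a similar parent lookup can be approximated in the standard library by building a child-to-parent map before removing anything. This is only a sketch: the function name `remove_can_id_values`, the `TARGET` constant, and the shortened sample document are assumptions made for illustration, not part of the original answer.

```python
import xml.etree.ElementTree as ET

TARGET = "/AUTOSAR_CAN/EcucModuleDefs/CanConfigSet/CanHardwareObject/CANIdValue"


def remove_can_id_values(root):
    """Remove each ECUC-NUMERICAL-PARAM-VALUES whose integer DEFINITION-REF is TARGET."""
    # ElementTree elements do not store their parent, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    removed = 0
    # Copy the matches into a list first so the tree is not modified while iterating.
    for elem in list(root.iter("ECUC-NUMERICAL-PARAM-VALUES")):
        ref = elem.find("DEFINITION-REF[@DEST='ECUC-INTEGER-PARAM-DEF']")
        parent = parent_of.get(elem)  # None only if elem is the root itself
        if ref is not None and parent is not None and (ref.text or "").strip() == TARGET:
            parent.remove(elem)
            removed += 1
    return removed


# Shortened stand-in for a.xml (hypothetical sample with two containers).
sample = ET.fromstring(f"""
<doc>
  <ECUC-CONTAINER-VALUE><PARAMETER-VALUES>
    <ECUC-NUMERICAL-PARAM-VALUES>
      <DEFINITION-REF DEST="ECUC-INTEGER-PARAM-DEF">{TARGET}</DEFINITION-REF>
      <VALUE>1053</VALUE>
    </ECUC-NUMERICAL-PARAM-VALUES>
    <ECUC-TEXTUAL-PARAM-VALUES><VALUE>TRUE</VALUE></ECUC-TEXTUAL-PARAM-VALUES>
  </PARAMETER-VALUES></ECUC-CONTAINER-VALUE>
  <ECUC-CONTAINER-VALUE><PARAMETER-VALUES>
    <ECUC-NUMERICAL-PARAM-VALUES>
      <DEFINITION-REF DEST="ECUC-INTEGER-PARAM-DEF">{TARGET}</DEFINITION-REF>
      <VALUE>1054</VALUE>
    </ECUC-NUMERICAL-PARAM-VALUES>
    <ECUC-TEXTUAL-PARAM-VALUES><VALUE>FALSE</VALUE></ECUC-TEXTUAL-PARAM-VALUES>
  </PARAMETER-VALUES></ECUC-CONTAINER-VALUE>
</doc>
""")

count = remove_can_id_values(sample)
```

The question omits the file header, so it is unknown whether the real file declares a default XML namespace; if it does, the tag names in all of these queries would need to be namespace-qualified before they match anything.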