Validating XML node structure with Python

Date: 2018-03-06 15:06:37

Tags: python xml elementtree

I have this file:



<?xml version='1.0' encoding='UTF-8'?>
<AUTOSAR xmlns="http://autosar.org/schema/r4.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://autosar.org/schema/r4.0 AUTOSAR_4-2-2_STRICT_COMPACT.xsd">
    <AR-PACKAGES>
        <AR-PACKAGE>
            <SHORT-NAME>RootP_Composition</SHORT-NAME>
            <COMPOSITION-SW-COMPONENT-TYPE>
                <SHORT-NAME>Compo_VSM</SHORT-NAME>
                <CONNECTORS>
                    <ASSEMBLY-SW-CONNECTOR>
                        <SHORT-NAME>PP_CS_VehicleSPeed_ASWC_M6_to_ASWC_M740</SHORT-NAME>
                        <PROVIDER-IREF>
                            <CONTEXT-COMPONENT-REF DEST="SW-COMPONENT-PROTOTYPE">/RootP_Composition/Compo_VSM/Instance_ASWC_M6</CONTEXT-COMPONENT-REF>
                            <TARGET-P-PORT-REF DEST="P-PORT-PROTOTYOPE">/RootP_ASWC_M6/ASWC_M6/PP_CS_VehicleSPeed</TARGET-P-PORT-REF>
                        </PROVIDER-IREF>
                        <REQUESTER-IREF>
                            <CONTEXT-COMPONENT-REF DEST="SW-COMPONENT-PROTOTYPE">/RootP_Composition/Compo_VSM/Instance_ASWC_M740</CONTEXT-COMPONENT-REF>
                            <TARGET-R-PORT-REF DEST="R-PORT-PROTOTYOPE">/RootP_ASWC_M740/ASWC_M740/RP_CS_VehicleSPeed</TARGET-R-PORT-REF>
                        </REQUESTER-IREF>
                    </ASSEMBLY-SW-CONNECTOR>
                </CONNECTORS>
            </COMPOSITION-SW-COMPONENT-TYPE>
        </AR-PACKAGE>
    </AR-PACKAGES>
</AUTOSAR>

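I want to check whether each ASSEMBLY-SW-CONNECTOR node has the children SHORT-NAME, PROVIDER-IREF and REQUESTER-IREF, and whether PROVIDER-IREF and REQUESTER-IREF in turn have the children (grandchildren of ASSEMBLY-SW-CONNECTOR) CONTEXT-COMPONENT-REF and TARGET-P-PORT-REF, and CONTEXT-COMPONENT-REF and TARGET-R-PORT-REF, respectively.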

So far I have this code:

tree = ET.parse('C:\\test\Abu\TRS.ABU.GEN.002\output\Connectors.arxml')
root = tree.getroot()
child = ["SHORT-NAME", "PROVIDER-IREF", "REQUESTER-IREF"]
grandchild = ["CONTEXT-COMPONENT-REF", "TARGET-P-PORT-REF", "CONTEXT-COMPONENT-REF", "TARGET-R-PORT-REF"]
connector = '{http://autosar.org/schema/r4.0}ASSEMBLY-SW-CONNECTOR'
for element in root.iter(tag = connector):
    for child in element:
        for grandchild in child:
            if child.tag.split('}', 1)[1] in child:
                if grandchild.tag.split('}', 1)[1] in grandchild:
                    print("yes")
                else:
                    print("No")

Where am I going wrong? Thanks in advance!

Update 1

tree = etree.parse('C:\\test\Abu\TRS.ABU.GEN.002\output\Connectors.arxml')
root = tree.getroot()
found_name = found_provider = found_requester = found_contextP = found_targetP = found_contextR =found_targetR = False
connectors =  root.findall(".//{http://autosar.org/schema/r4.0}ASSEMBLY-SW-CONNECTOR>")
for elem in connectors:
    if elem.find(".//{http://autosar.org/schema/r4.0}SHORT-NAME>"):
        found_name = True
    if elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF>"):
        found_provider = True
        for child in elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF>"):
            if child.find(".//{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF>"):
                found_contextR = True
            if child.find(".//{http://autosar.org/schema/r4.0}TARGET-P-PORT-REF>"):
                found_targetP = True
    if elem.find(".//{http://autosar.org/schema/r4.0}REQUESTER-IREF>"):
        found_requester = True
        for child in elem.find(".//{http://autosar.org/schema/r4.0}REQUESTER-IREF>"):
            if child.find(".//{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF>"):
                found_contextR = True
            if child.find(".//{http://autosar.org/schema/r4.0}TARGET-R-PORT-REF>"):
                found_targetR = True

if found_name and found_provider and found_requester and found_contextP and found_targetP and found_contextR and found_targetR:
    print("True")
else:
    print("False")

Any idea why I get a False result?

1 Answer:

Answer 0 (score: 0)

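Two issues:

First, some typos/minor errors:

  • You have an unnecessary closing angle bracket (>) at the end of every search path, so those all need to be removed.
  • In your found_provider section you set found_contextR when I think you meant found_contextP (P, not R).
  • Using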

    if elem.find("<path>"):
    

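    raises a warning (and an element that was found but has no children tests as false), so you should use this instead: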

    if elem.find("<path>") is not None:
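To illustrate the last point, here is a small sketch, assuming the standard-library xml.etree.ElementTree and a made-up two-element document (the tag names parent, leaf and absent are illustrative only, not from the question). The length of an element is its number of children, so a leaf such as SHORT-NAME has length 0 even though it was found; comparing against None is the test that actually answers "was it found?".

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document: <leaf> exists but has no child elements.
root = ET.fromstring("<parent><leaf>text</leaf></parent>")

leaf = root.find("leaf")        # an Element: the search succeeded
missing = root.find("absent")   # None: nothing matched

# Explicit tests keep "found" separate from "has children".
leaf_found = leaf is not None
leaf_child_count = len(leaf)
missing_found = missing is not None

print(leaf_found, leaf_child_count, missing_found)
```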
    

Second, you made a mistake in the part that handles the child elements, for example in the found_provider section:

if elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF>"):
    found_provider = True
    for child in elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF>"):
        if child.find(".//{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF>"):
            found_contextR = True
        if child.find(".//{http://autosar.org/schema/r4.0}TARGET-P-PORT-REF>"):
            found_targetP = True

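You correctly find the PROVIDER-IREF element, and then you try to match the CONTEXT-COMPONENT-REF and TARGET-P-PORT-REF elements. But you search for them as children of PROVIDER-IREF's child elements (i.e. as grandchildren of PROVIDER-IREF), when they are themselves the children.

So you need to check the tags of the child elements, rather than searching for elements beneath them: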

if elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF") is not None:
    found_provider = True
    for child in elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF"):
        if child.tag == "{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF":
            found_contextP = True
        if child.tag == "{http://autosar.org/schema/r4.0}TARGET-P-PORT-REF":
            found_targetP = True

Or you could extract the PROVIDER-IREF element first and then look for the elements directly beneath it:

provider = elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF")
if provider is not None:
    found_provider = True
    if provider.find("{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF") is not None:
        found_contextP = True
    if provider.find("{http://autosar.org/schema/r4.0}TARGET-P-PORT-REF") is not None:
        found_targetP = True

Obviously, do the same for the found_requester section.
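Putting both sections together might look like the sketch below. It is only one possible arrangement, not the asker's code: the helper names has_children and connector_is_valid and the trimmed-down SAMPLE document are assumptions made for illustration, and it uses the standard-library xml.etree.ElementTree. It also validates each connector separately instead of sharing one set of flags across all connectors, so a missing element in one connector cannot be masked by another.

```python
import xml.etree.ElementTree as ET

NS = "{http://autosar.org/schema/r4.0}"

# Trimmed-down stand-in for the ARXML file from the question.
SAMPLE = """<AUTOSAR xmlns="http://autosar.org/schema/r4.0">
  <ASSEMBLY-SW-CONNECTOR>
    <SHORT-NAME>Demo</SHORT-NAME>
    <PROVIDER-IREF>
      <CONTEXT-COMPONENT-REF>/a</CONTEXT-COMPONENT-REF>
      <TARGET-P-PORT-REF>/b</TARGET-P-PORT-REF>
    </PROVIDER-IREF>
    <REQUESTER-IREF>
      <CONTEXT-COMPONENT-REF>/c</CONTEXT-COMPONENT-REF>
      <TARGET-R-PORT-REF>/d</TARGET-R-PORT-REF>
    </REQUESTER-IREF>
  </ASSEMBLY-SW-CONNECTOR>
</AUTOSAR>"""


def has_children(parent, names):
    """Return True if parent exists and has a direct child for every name."""
    if parent is None:
        return False
    return all(parent.find(NS + name) is not None for name in names)


def connector_is_valid(connector):
    """Check one ASSEMBLY-SW-CONNECTOR for the expected children and grandchildren."""
    provider = connector.find(NS + "PROVIDER-IREF")
    requester = connector.find(NS + "REQUESTER-IREF")
    return (
        connector.find(NS + "SHORT-NAME") is not None
        and has_children(provider, ["CONTEXT-COMPONENT-REF", "TARGET-P-PORT-REF"])
        and has_children(requester, ["CONTEXT-COMPONENT-REF", "TARGET-R-PORT-REF"])
    )


root = ET.fromstring(SAMPLE)
results = [connector_is_valid(c) for c in root.iter(NS + "ASSEMBLY-SW-CONNECTOR")]
print(results)
```

Note that find() without a leading ".//" only looks at direct children, which is what the required structure describes.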

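I think your original approach was actually quite good: specify the child/grandchild structure, then check whether the XML fits it. But you need to specify which grandchildren belong to which child, so perhaps use a nested dictionary like this: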

structure = {
    "ASSEMBLY-SW-CONNECTOR": {
        "SHORT-NAME": None,
        "PROVIDER-IREF": {
            "CONTEXT-COMPONENT-REF": None,
            "TARGET-P-PORT-REF": None
            },
        "REQUESTER-IREF": {
            "CONTEXT-COMPONENT-REF": None,
            "TARGET-R-PORT-REF": None
            }
        }
    }

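Then have a recursive function (i.e. a function that calls itself) that searches for the matching children until it reaches None, at which point it stops looking down that branch.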
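A minimal sketch of such a recursive check follows, assuming the standard-library xml.etree.ElementTree. The function names matches and validate, the constant names, and the small GOOD document are hypothetical and chosen only for illustration. The sketch checks only the first child with each name, and validate returns True when the document contains no connectors at all.

```python
import xml.etree.ElementTree as ET

NS = "{http://autosar.org/schema/r4.0}"

STRUCTURE = {
    "ASSEMBLY-SW-CONNECTOR": {
        "SHORT-NAME": None,
        "PROVIDER-IREF": {
            "CONTEXT-COMPONENT-REF": None,
            "TARGET-P-PORT-REF": None,
        },
        "REQUESTER-IREF": {
            "CONTEXT-COMPONENT-REF": None,
            "TARGET-R-PORT-REF": None,
        },
    },
}


def matches(element, expected):
    """Return True if element has every direct child named in expected, recursively."""
    if expected is None:  # leaf of the specification: stop descending
        return True
    for name, sub_expected in expected.items():
        child = element.find(NS + name)  # first direct child with that tag
        if child is None or not matches(child, sub_expected):
            return False
    return True


def validate(root, structure=STRUCTURE):
    """Check every element named at the top level of structure, anywhere under root."""
    return all(
        matches(element, expected)
        for name, expected in structure.items()
        for element in root.iter(NS + name)
    )


# Small stand-in document with the full expected structure.
GOOD = ET.fromstring(
    '<AUTOSAR xmlns="http://autosar.org/schema/r4.0"><ASSEMBLY-SW-CONNECTOR>'
    "<SHORT-NAME>Demo</SHORT-NAME>"
    "<PROVIDER-IREF><CONTEXT-COMPONENT-REF/><TARGET-P-PORT-REF/></PROVIDER-IREF>"
    "<REQUESTER-IREF><CONTEXT-COMPONENT-REF/><TARGET-R-PORT-REF/></REQUESTER-IREF>"
    "</ASSEMBLY-SW-CONNECTOR></AUTOSAR>"
)
print(validate(GOOD))
```

Keeping the expected structure in data means that adding another required element is a one-line change to the dictionary rather than another found_ flag.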