Question

I'm using etree on Python 2.7.15 and I'm stuck. I'm trying to parse an XML file to get values out of it, as shown in the code below:

# -*- coding: utf-8 -*-

import xml.etree.ElementTree as etree

def XMLParse(filename):
   filename = filename
   tree = etree.parse(filename)
   beans = tree.findall('{http://www.speedframework.org/schema/beans}bean')

   for bean in beans:
     for property in bean:

        if "name" in property.attrib and "value" in property.attrib:
            print ("This one catches PROP1:" + property.attrib['name'])
            print property.attrib

        if "name" in property.attrib and not "value" in property.attrib:
            for util in property.iter():
                for lists in util:
                    for parameter in lists:


                        if 'key' in parameter.attrib:
                            print ("This one catches PROP3:" + parameter.attrib['key'])

                        if 'bean' in parameter.attrib:
                            print ("This one catches PROP4:" + parameter.attrib['bean'])

                        if 'value' in parameter.attrib:
                            print ("This one should catch PROP2:" + parameter.attrib['value'])
                        print parameter.attrib


filename = open('static/test1.xml')
XMLParse(filename)

Here is my XML:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.speedframework.org/schema/beans"
xmlns:cxf="http://cxf.apache.org/core" 
xmlns:jaxws="http://cxf.apache.org/jaxws"
xmlns:test="http://apache.org/hello_world_soap_http" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:util="http://www.speedframework.org/schema/util" 
xmlns:http="http://cxf.apache.org/transports/http/configuration"
xmlns:sec="http://cxf.apache.org/configuration/security"
xmlns:context="http://www.speedframework.org/schema/context"
xsi:schemaLocation="
http://cxf.apache.org/core
http://cxf.apache.org/schemas/core.xsd
http://www.speedframework.org/schema/beans
http://www.speedframework.org/schema/beans/speed-beans-2.0.xsd
http://www.speedframework.org/schema/context
http://www.speedframework.org/schema/context/speed-context-3.0.xsd
http://cxf.apache.org/jaxws
http://cxf.apache.org/schemas/jaxws.xsd
http://www.speedframework.org/schema/util
http://www.speedframework.org/schema/util/speed-util-2.0.xsd
http://cxf.apache.org/transports/http/configuration
http://cxf.apache.org/schemas/configuration/http-conf.xsd
http://cxf.apache.org/configuration/security
http://cxf.apache.org/schemas/configuration/security.xsd">

<context:property-placeholder location="classpath:realm.properties"/>

<bean id="FOO" class="BAR">
    <property name="Prop1" value="ValueProp1" />
    <property name="Prop2">
        <util:list>
            <value>PropValue2A</value>
            <value>PropValue2B</value>
        </util:list>
    </property>
    <property name="Prop3">
        <util:map>
            <entry key="Prop3Key" value-ref="Prop3Value" />
        </util:map>
    </property>
    <property name="Prop4">
        <util:list>
            <ref bean="Prop4" />
        </util:list>
    </property>
</bean>
</beans>

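As you can see, Prop1, Prop3 and Prop4 parse fine. The problem is Prop2: when I try to get its attributes, all I get is two empty braces, {} {}. My real XML is much bigger, which is why I'm using loops. Would XPath be a better solution?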

Output:

This one catches PROP1:Prop1
{'name': 'Prop1', 'value': 'ValueProp1'}
{}
{}
This one catches PROP3:Prop3Key
{'value-ref': 'Prop3Value', 'key': 'Prop3Key'}
This one catches PROP4:Prop4
{'bean': 'Prop4'}

Main question: how do I get all the "Prop2" values from util:list?

Answer 1

if 'value' in parameter.attrib:

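This doesn't look right to me. Suppose parameter refers to the <value>PropValue2A</value> element. That is a value element, but it has no value attribute. If it did, it would look like:

<value value="whatever">PropValue2A</value>

In this case, I think you want to check the element's name (its tag) rather than its attributes.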

for parameter in lists:


    if 'key' in parameter.attrib:
        print ("This one catches PROP3:" + parameter.attrib['key'])

    if 'bean' in parameter.attrib:
        print ("This one catches PROP4:" + parameter.attrib['bean'])

    if 'value' in parameter.tag:
        print ("This one should catch PROP2:" + parameter.tag)
    print parameter.attrib

Now your third condition passes twice while iterating over Prop2:

This one catches PROP1:Prop1
{'name': 'Prop1', 'value': 'ValueProp1'}
This one should catch PROP2:{http://www.speedframework.org/schema/beans}value
{}
This one should catch PROP2:{http://www.speedframework.org/schema/beans}value
{}
This one catches PROP3:Prop3Key
{'value-ref': 'Prop3Value', 'key': 'Prop3Key'}
This one catches PROP4:Prop4
{'bean': 'Prop4'}
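
Note that the output above prints each element's tag. If what you actually want is the content between the tags (PropValue2A and PropValue2B), that is stored in the element's text attribute, not in attrib. A minimal sketch, assuming a trimmed-down copy of the XML from the question (the SAMPLE string and the variable names are illustrative, not part of the original code):

```python
import xml.etree.ElementTree as etree

# Trimmed-down copy of the question's XML, for illustration only.
SAMPLE = """<beans xmlns="http://www.speedframework.org/schema/beans"
       xmlns:util="http://www.speedframework.org/schema/util">
  <bean id="FOO" class="BAR">
    <property name="Prop2">
      <util:list>
        <value>PropValue2A</value>
        <value>PropValue2B</value>
      </util:list>
    </property>
  </bean>
</beans>"""

# Default namespace of the document, in ElementTree's {uri} notation.
BEANS_NS = "{http://www.speedframework.org/schema/beans}"

root = etree.fromstring(SAMPLE)
prop2_values = []

# iter(tag) walks the whole subtree and yields only elements with that tag.
for value in root.iter(BEANS_NS + "value"):
    # tag is the qualified name, attrib is empty here, text holds the content.
    print("%s %r %s" % (value.tag, value.attrib, value.text))
    prop2_values.append(value.text)
```

After the loop, prop2_values holds the two strings from the util:list.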

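Additionally, I think you have too many for loops in your code. You have five of them, but the elements are nested at most four levels deep (bean, property, util:list, value). You get sensible output anyway because property.iter() iterates over every node in the property's subtree, including the property itself, so in some cases that loop effectively cancels out. You could simplify things by iterating only over the direct children of property and dropping one of the loops.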
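
To see why the extra loop cancels out, it helps to compare iter() with plain iteration over an element. A minimal sketch on an illustrative fragment (namespaces are omitted to keep the tags short, and the variable names are assumptions for this example):

```python
import xml.etree.ElementTree as etree

# Illustrative fragment shaped like the Prop2 property from the question.
prop = etree.fromstring(
    '<property name="Prop2">'
    '<list><value>PropValue2A</value><value>PropValue2B</value></list>'
    '</property>'
)

# iter() yields the element itself first, then every descendant in document order.
all_nodes = [node.tag for node in prop.iter()]

# Iterating over the element directly yields only its immediate children.
children = [child.tag for child in prop]

print(all_nodes)   # ['property', 'list', 'value', 'value']
print(children)    # ['list']
```

Because iter() starts with the property itself, the two inner loops of the original code still reach the value elements on that first pass; the later passes simply find nothing to iterate over.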

import xml.etree.ElementTree as etree

def XMLParse(filename):
    tree = etree.parse(filename)
    beans = tree.findall('{http://www.speedframework.org/schema/beans}bean')

    for bean in beans:
        for property in bean:

            # Simple property: both name and value are attributes.
            if "name" in property.attrib and "value" in property.attrib:
                print ("This one catches PROP1:" + property.attrib['name'])
                print property.attrib

            # Nested property: visit only its direct children (util:list, util:map).
            if "name" in property.attrib and "value" not in property.attrib:
                for util in property:
                    for parameter in util:
                        if 'key' in parameter.attrib:
                            print ("This one catches PROP3:" + parameter.attrib['key'])

                        if 'bean' in parameter.attrib:
                            print ("This one catches PROP4:" + parameter.attrib['bean'])

                        # <value> elements are matched by tag, not by attribute.
                        if 'value' in parameter.tag:
                            print ("This one should catch PROP2:" + parameter.tag)
                        print parameter.attrib


# etree.parse accepts a path, so there is no file handle left open.
XMLParse('data.xml')

You should still get the same output, and it will run a little faster.

This one catches PROP1:Prop1
{'name': 'Prop1', 'value': 'ValueProp1'}
This one should catch PROP2:{http://www.speedframework.org/schema/beans}value
{}
This one should catch PROP2:{http://www.speedframework.org/schema/beans}value
{}
This one catches PROP3:Prop3Key
{'value-ref': 'Prop3Value', 'key': 'Prop3Key'}
This one catches PROP4:Prop4
{'bean': 'Prop4'}
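
As for the XPath idea raised in the question: ElementTree supports only a limited subset of XPath, but that subset is enough to select the values directly. A minimal sketch, assuming the same namespaces as the question's XML; the helper get_list_values and the SAMPLE string are hypothetical names introduced for this example:

```python
import xml.etree.ElementTree as etree

# Namespaces from the question's XML, in ElementTree's {uri} notation.
BEANS = "{http://www.speedframework.org/schema/beans}"
UTIL = "{http://www.speedframework.org/schema/util}"

# Trimmed-down copy of the question's XML, for illustration only.
SAMPLE = """<beans xmlns="http://www.speedframework.org/schema/beans"
       xmlns:util="http://www.speedframework.org/schema/util">
  <bean id="FOO" class="BAR">
    <property name="Prop1" value="ValueProp1" />
    <property name="Prop2">
      <util:list>
        <value>PropValue2A</value>
        <value>PropValue2B</value>
      </util:list>
    </property>
    <property name="Prop4">
      <util:list>
        <ref bean="Prop4" />
      </util:list>
    </property>
  </bean>
</beans>"""


def get_list_values(root, property_name):
    """Return the text of every <value> in the util:list of the named property."""
    # Path is relative to the <beans> root; [@name='...'] filters on the attribute.
    path = "{b}bean/{b}property[@name='{name}']/{u}list/{b}value".format(
        b=BEANS, u=UTIL, name=property_name
    )
    return [value.text for value in root.findall(path)]


root = etree.fromstring(SAMPLE)
print(get_list_values(root, "Prop2"))
```

This replaces the nested loops with a single findall call per property. A property without a util:list of value elements, such as Prop1, just yields an empty list.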

Using ElementTree in Python 2.7.15 to get values from util:list
