How to process key-value style tags in XML using Python

Date: 2015-01-04 02:00:27

Tags: python xml

Given the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<Environment
     xmlns="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:oe="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:ve="http://www.vmware.com/schema/ovfenv"
     oe:id=""
     ve:vCenterId="vm-61">
   <PlatformSection>
      <Kind>VMware ESXi</Kind>
      <Version>5.5.0</Version>
      <Vendor>VMware, Inc.</Vendor>
      <Locale>en</Locale>
   </PlatformSection>
   <PropertySection>
         <Property oe:key="ppEnv" oe:value="production"/>
         <Property oe:key="pphostname" oe:value="coolhostname"/>
   </PropertySection>
   <ve:EthernetAdapterSection>
      <ve:Adapter ve:mac="00:50:56:94:9a:56" ve:network="Service" ve:unitNumber="7"/>
   </ve:EthernetAdapterSection>
</Environment>

I want to get the value for the oe:key "pphostname", but I can't find a clear way to do it.

I'm new to Python and XML. This is everything I have tried so far in Python:

>>> import libxml2
>>> doc = libxml2.parseFile("test.xml")
>>> doc.xpathEval("//Property/*")
[]
>>> doc.xpathEval("//Property/@*")
[]
>>> doc.xpathEval("//Property")
[]
>>> doc.xpathEval("//*")
[<xmlNode (Environment) object at 0x7fb551e8e320>, <xmlNode (PlatformSection) object at 0x7fb551eb3a28>, <xmlNode (Kind) object at 0x7fb551daa950>, <xmlNode (Version) object at 0x7fb551daa998>, <xmlNode (Vendor) object at 0x7fb551daa9e0>, <xmlNode (Locale) object at 0x7fb551daaa28>, <xmlNode (PropertySection) object at 0x7fb551daaa70>, <xmlNode (Property) object at 0x7fb551daaab8>, <xmlNode (Property) object at 0x7fb551daab00>, <xmlNode (EthernetAdapterSection) object at 0x7fb551daab48>, <xmlNode (Adapter) object at 0x7fb551daab90>]
>>> doc.xpathEval("/Environment/PropertySection/Property[1]")
[]
>>> doc.xpathEval("/Environment/PropertySection/Property/oe:key")
Undefined namespace prefix

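I'm more familiar with bash, but I would rather not parse this with bash utilities.

3 answers:

Answer 0: (score: 1)

You can give the namespace (oe) a name and match the key-value pair in the element's attributes.

In this example I use the xml module: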

import xml.etree.ElementTree as ET

s = '''<?xml version="1.0" encoding="UTF-8"?>
<Environment
     xmlns="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:oe="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:ve="http://www.vmware.com/schema/ovfenv"
     oe:id=""
     ve:vCenterId="vm-61">
   <PlatformSection>
      <Kind>VMware ESXi</Kind>
      <Version>5.5.0</Version>
      <Vendor>VMware, Inc.</Vendor>
      <Locale>en</Locale>
   </PlatformSection>
   <PropertySection>
         <Property oe:key="ppEnv" oe:value="production"/>
         <Property oe:key="pphostname" oe:value="coolhostname"/>
   </PropertySection>
   <ve:EthernetAdapterSection>
      <ve:Adapter ve:mac="00:50:56:94:9a:56" ve:network="Service" ve:unitNumber="7"/>
   </ve:EthernetAdapterSection>
</Environment>'''

tree = ET.fromstring(s)
oe = '{http://schemas.dmtf.org/ovf/environment/1}'

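# Walk every Property element in the oe namespace
for node in tree.iter(oe+'Property'):
    # .get() returns None instead of raising KeyError when the attribute is absent
    if node.get(oe+'key') == 'pphostname':
        print(node.get(oe+'value'))

Result: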

coolhostname
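
If more than one property is needed, the same loop can collect every key-value pair into a dictionary. This is only a minimal sketch: the `read_properties` helper and the `XML` constant are names made up for illustration, and the document is a trimmed copy of the one in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document (assumption: only PropertySection matters here)
XML = '''<?xml version="1.0" encoding="UTF-8"?>
<Environment
     xmlns="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:oe="http://schemas.dmtf.org/ovf/environment/1">
   <PropertySection>
         <Property oe:key="ppEnv" oe:value="production"/>
         <Property oe:key="pphostname" oe:value="coolhostname"/>
   </PropertySection>
</Environment>'''

# Namespace URI in the "{uri}name" form that ElementTree uses for qualified names
OE = '{http://schemas.dmtf.org/ovf/environment/1}'


def read_properties(xml_text):
    """Hypothetical helper: map each oe:key to its oe:value."""
    root = ET.fromstring(xml_text)
    props = {}
    for node in root.iter(OE + 'Property'):
        key = node.get(OE + 'key')
        if key is not None:  # skip Property elements that carry no oe:key
            props[key] = node.get(OE + 'value')
    return props


properties = read_properties(XML)
print(properties.get('pphostname'))
```

Looking values up with `properties.get(...)` returns `None` for a key that is not present, which avoids a separate existence check.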

Answer 1: (score: 1)

Try using xml.dom.minidom:

from xml.dom import minidom

xml_doc = minidom.parse('test.xml')
property_items = xml_doc.getElementsByTagName("Property")

# Parentheses let the condition continue onto a second line
condition = lambda x: (x.hasAttribute('oe:key') and
                       x.attributes['oe:key'].value == "pphostname")

matched_elements = [x for x in property_items if condition(x)]

if matched_elements:
    matched_element = matched_elements[0]
    print(matched_element.attributes['oe:value'].value)
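
The lookup above matches the literal string 'oe:key', so it depends on the document using exactly the prefix `oe`. A possible variation, sketched here under the assumption that matching on the namespace URI is acceptable, uses minidom's namespace-aware methods instead. The `find_property` helper and the trimmed `XML` string are illustrative names, not part of the original answer.

```python
from xml.dom import minidom

# Trimmed copy of the question's document (assumption: only PropertySection matters here)
XML = '''<?xml version="1.0" encoding="UTF-8"?>
<Environment
     xmlns="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:oe="http://schemas.dmtf.org/ovf/environment/1">
   <PropertySection>
         <Property oe:key="ppEnv" oe:value="production"/>
         <Property oe:key="pphostname" oe:value="coolhostname"/>
   </PropertySection>
</Environment>'''

OE_NS = 'http://schemas.dmtf.org/ovf/environment/1'


def find_property(doc, key):
    """Hypothetical helper: return the oe:value whose oe:key equals `key`, else None."""
    for elem in doc.getElementsByTagNameNS(OE_NS, 'Property'):
        # getAttributeNS returns an empty string when the attribute is absent
        if elem.getAttributeNS(OE_NS, 'key') == key:
            return elem.getAttributeNS(OE_NS, 'value')
    return None


doc = minidom.parseString(XML.encode('utf-8'))
hostname = find_property(doc, 'pphostname')
print(hostname)
```

Because the match is on the URI and the local name, the same code should keep working if the file is regenerated with a different prefix bound to that namespace.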

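Answer 2: (score: 0)

See this document (section 6.2: Namespace Defaulting). Your XML declares a default namespace (xmlns="http://schemas.dmtf.org/ovf/environment/1"), so I think the default namespace has to be added to the XPath. Below is test code using the lxml library (libxml2 should be similar).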

from lxml import etree

s = '''<?xml version="1.0" encoding="UTF-8"?>
<Environment
     xmlns="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:oe="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:ve="http://www.vmware.com/schema/ovfenv"
     oe:id=""
     ve:vCenterId="vm-61">
   <PlatformSection>
      <Kind>VMware ESXi</Kind>
      <Version>5.5.0</Version>
      <Vendor>VMware, Inc.</Vendor>
      <Locale>en</Locale>
   </PlatformSection>
   <PropertySection>
         <Property oe:key="ppEnv" oe:value="production"/>
         <Property oe:key="pphostname" oe:value="coolhostname"/>
   </PropertySection>
   <ve:EthernetAdapterSection>
      <ve:Adapter ve:mac="00:50:56:94:9a:56" ve:network="Service" ve:unitNumber="7"/>
   </ve:EthernetAdapterSection>
</Environment>'''

# Parse from bytes: lxml rejects unicode strings that carry an encoding declaration
tree = etree.fromstring(s.encode('utf-8'))

# Prefix-to-URI map used to resolve the prefixes in the XPath expression
namespaces = {'oe': 'http://schemas.dmtf.org/ovf/environment/1', 'xsi': 'http://www.w3.org/2001/XMLSchema-instance', 've': 'http://www.vmware.com/schema/ovfenv'}

print(tree.xpath('//oe:Property[@oe:key="pphostname"]/@oe:value', namespaces=namespaces))
#output ['coolhostname']
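
The empty results in the question follow from the same rule: an unprefixed name in a path expression does not pick up the document's default namespace, so `//Property` looks for a Property element in no namespace. The sketch below shows the difference using the standard library's ElementTree rather than lxml or libxml2 (a swapped-in library, chosen only so the example needs no extra install). The `XML` string is a trimmed copy of the question's document, and `NS` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document (assumption: only PropertySection matters here)
XML = '''<?xml version="1.0" encoding="UTF-8"?>
<Environment
     xmlns="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:oe="http://schemas.dmtf.org/ovf/environment/1">
   <PropertySection>
         <Property oe:key="ppEnv" oe:value="production"/>
         <Property oe:key="pphostname" oe:value="coolhostname"/>
   </PropertySection>
</Environment>'''

# Prefix-to-URI map; the prefix only has to be consistent within this script
NS = {'oe': 'http://schemas.dmtf.org/ovf/environment/1'}

root = ET.fromstring(XML)

# Unprefixed name: searched in no namespace, so nothing matches
unqualified = root.findall('.//Property')

# Prefixed name resolved through the map: matches both Property elements
qualified = root.findall('.//oe:Property', NS)

print(len(unqualified), len(qualified))
```

With the libxml2 bindings used in the question, the equivalent step is expected to be registering the prefix on an XPath context (`xpathNewContext()` followed by `xpathRegisterNs()`) before evaluating the expression. That variant is not exercised here.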