Accessing XML element values with Python

Date: 2018-01-18 15:12:07

Tags: python xml parsing

I'm new to Python, so I apologize if this has been discussed before and I'm simply too inexperienced to apply the solution.

Here is the XML:

<?xml version="1.0" encoding="UTF-8"?>
<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd" xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <modelVersion>4.0.0</modelVersion>
  <groupId>DEFAULT</groupId>
  <artifactId>ADP_ServiceTechnology-JRG_Testing</artifactId>
  <version>2.0.31</version>
  <dependencies>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>ADP Standard Operations</artifactId>
      <version>2.2.86.17-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Base</artifactId>
      <version>1.9.0-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Databases</artifactId>
      <version>[1.1.0]</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>HPE Solutions</artifactId>
      <version>[1.8.2]</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Business Applications</artifactId>
      <version>[1.3.0]</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Operating Systems</artifactId>
      <version>[1.3.0]</version>
    </dependency>
  </dependencies>
</project>

I can load the data successfully:

import xml.etree.ElementTree as ET
tree = ET.parse('pom.xml')
root = tree.getroot()

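I just need to walk the tree and retrieve the <artifactId> and <version> values. I have tried many approaches found online, with no luck. This is simple for me with PHP and XPath, but Python is giving me a hard time.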

This:

for elem in tree.iter():
  print("%s: '%s'" % (elem.tag, elem.text))

returns every element's tag and text, but I want to navigate to just the two I mentioned.

Thanks in advance!

1 answer:

Answer 0: (score: 0)

If you prefix the tag names with the namespace, you can find them:

import xml.etree.ElementTree as ET
tree = ET.parse("t.xml")
root = tree.getroot()

namesp = root.tag.replace("project","")  # get the namespace prefix, e.g. "{uri}", from the root tag

version = root.find(namesp+"version")
artifactId = root.find(namesp+"artifactId")

print(version.text)
print(artifactId.text)

Output:

2.0.31
ADP_ServiceTechnology-JRG_Testing 
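
The snippet above reads the project's own <artifactId> and <version>. To reach the same two tags inside each <dependency>, one option is to pass a prefix-to-URI mapping to findall and findtext. This is a minimal sketch, not the only way to do it: the XML is a trimmed copy of the question's POM inlined as a string, the prefix "pom" is an arbitrary choice, and dependency_versions is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's POM, inlined so the sketch is self-contained.
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <artifactId>ADP_ServiceTechnology-JRG_Testing</artifactId>
  <version>2.0.31</version>
  <dependencies>
    <dependency>
      <artifactId>Base</artifactId>
      <version>1.9.0-SNAPSHOT</version>
    </dependency>
    <dependency>
      <artifactId>Databases</artifactId>
      <version>[1.1.0]</version>
    </dependency>
  </dependencies>
</project>"""

# "pom" is an arbitrary prefix (assumption); only the URI has to match the document.
NS = {"pom": "http://maven.apache.org/POM/4.0.0"}


def dependency_versions(root):
    """Return (artifactId, version) text pairs for every <dependency>."""
    pairs = []
    for dep in root.findall("pom:dependencies/pom:dependency", NS):
        pairs.append((dep.findtext("pom:artifactId", namespaces=NS),
                      dep.findtext("pom:version", namespaces=NS)))
    return pairs


root = ET.fromstring(POM)
for artifact_id, version in dependency_versions(root):
    print("%s: %s" % (artifact_id, version))
```

With a file on disk, ET.parse("pom.xml").getroot() would take the place of ET.fromstring(POM).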

You can find more general information here: Parsing XML in Python using ElementTree example

and in the documentation: https://docs.python.org/3/library/xml.etree.elementtree.html (switch to Python 2 at the top of the page)

If you want to strip the namespace from the data, see "Python ElementTree module: How to ignore the namespace of XML files to locate matching element when using the method "find", "findall"" at https://stackoverflow.com/a/25920989/7505395
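
As a rough illustration of that idea (not necessarily the approach taken in the linked answer), the "{uri}" part can be removed from every tag after parsing, so that plain tag names work with find. strip_namespaces is a hypothetical helper name, the sample XML is a trimmed copy of the question's POM, and attribute names are left untouched in this sketch.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove the '{uri}' prefix from every element tag, in place."""
    for elem in root.iter():
        # Tags of namespaced elements look like "{uri}name".
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


# Trimmed copy of the question's POM, inlined so the sketch is self-contained.
SAMPLE = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <artifactId>ADP_ServiceTechnology-JRG_Testing</artifactId>
  <version>2.0.31</version>
</project>"""

root = strip_namespaces(ET.fromstring(SAMPLE))
print(root.find("artifactId").text)
print(root.find("version").text)
```

This discards the namespace information, so it is only reasonable when the document uses a single namespace or when tag names cannot collide.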