ElementTree XML API and child elements

Date: 2016-12-09 16:29:17

Tags: python xml parsing elementtree

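I am trying to use the USPS API to return the status of a package being tracked. I have a method that returns an ElementTree.Element object built from the XML string returned by the USPS API.

This is the XML string that is returned.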

<?xml version="1.0" encoding="UTF-8"?>
  <TrackResponse>
    <TrackInfo ID="EJ958088694US">
      <TrackSummary>The Postal Service could not locate the tracking information for your 
       request. Please verify your tracking number and try again later.</TrackSummary>
    </TrackInfo>
  </TrackResponse>

I parse it into an Element object:

response = xml.etree.ElementTree.fromstring(xml_str)

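I can see in the XML string that the tag 'TrackSummary' exists, and I would expect to be able to access it with ElementTree's find method.

As additional proof, I can iterate over the response object and show that the 'TrackSummary' tag exists.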

for item in response.iter():
    print(item, item.text)

Returns:

<Element 'TrackResponse' at 0x00000000041B4B38> None
<Element 'TrackInfo' at 0x00000000041B4AE8> None
<Element 'TrackSummary' at 0x00000000041B4B88> The Postal Service could not locate the tracking information for your request. Please verify your tracking number and try again later.

So here is the problem.

print(response.find('TrackSummary'))

returns

None

Am I missing something here? It seems like I should be able to find the child element without any problem.

2 answers:

Answer 0: (score: 1)

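The .find() method searches only the element's direct children, not recursively. To search recursively you need an XPath query, where the double slash // means a recursive search. Note that the .xpath() method used here is provided by lxml elements; elements from the standard-library xml.etree.ElementTree do not have it, but their find() and findall() methods accept the limited path './/TrackSummary'. Try this:

# NOTE: .xpath() exists on lxml elements only, not on xml.etree.ElementTree elements
# returns a list of elements with tag TrackSummary
response.xpath('//TrackSummary')

# returns a list of the text nodes contained in each TrackSummary tag
response.xpath('//TrackSummary/node()')

# standard-library equivalent: the './/' prefix searches all descendants
response.findall('.//TrackSummary')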
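
For readers who only have the standard library, here is a minimal sketch of the difference between a direct-child lookup and a descendant lookup. It assumes a hand-copied version of the XML from the question (the XML declaration is left out and the text is joined onto one line); the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Assumption: a hand-copied version of the response XML from the question,
# without the XML declaration.
xml_str = (
    '<TrackResponse>'
    '<TrackInfo ID="EJ958088694US">'
    '<TrackSummary>The Postal Service could not locate the tracking '
    'information for your request. Please verify your tracking number '
    'and try again later.</TrackSummary>'
    '</TrackInfo>'
    '</TrackResponse>'
)

response = ET.fromstring(xml_str)

# A bare tag name matches direct children of TrackResponse only,
# and TrackSummary is a grandchild, so nothing is found.
direct = response.find('TrackSummary')

# The './/' prefix searches all descendants of the current element.
nested = response.find('.//TrackSummary')

print(direct)
print(nested.text)
```

The first print shows None, which is the behavior described in the question; the second prints the summary text.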

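Answer 1: (score: 1)

# The old cElementTree alias (once 15 to 20 times faster) was removed in Python 3.9;
# ElementTree now uses the C accelerator automatically when it is available.
import xml.etree.ElementTree as ET

response = ET.fromstring(xml_str)  # xml_str is the XML string from the question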

From the XPath syntax section of the documentation: * selects all child elements. For example, */egg selects all grandchildren named egg.

element = response.findall('*/TrackSummary')  # returns a list of matching elements
print(element[0].text)  # prints the first match; iterate the list to see all

>>> The Postal Service could not locate the tracking information for your request. Please verify your tracking number and try again later.
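
If a response can contain several TrackInfo entries, one way to pair each summary with its tracking ID is sketched below. This is only an illustration: the sample XML, the second tracking ID, the summary texts, and the helper name summaries_by_id are all made up for the example and are not part of the USPS API.

```python
import xml.etree.ElementTree as ET

# Assumption: a made-up response with two TrackInfo entries,
# modeled on the structure of the XML in the question.
sample = (
    '<TrackResponse>'
    '<TrackInfo ID="EJ958088694US">'
    '<TrackSummary>Not found.</TrackSummary>'
    '</TrackInfo>'
    '<TrackInfo ID="EJ000000000US">'
    '<TrackSummary>Delivered.</TrackSummary>'
    '</TrackInfo>'
    '</TrackResponse>'
)


def summaries_by_id(root):
    """Hypothetical helper: map each TrackInfo ID to its TrackSummary text."""
    return {
        # TrackInfo is a direct child of the root, and TrackSummary is a
        # direct child of TrackInfo, so plain tag names are enough here.
        info.get('ID'): info.findtext('TrackSummary', default='')
        for info in root.findall('TrackInfo')
    }


root = ET.fromstring(sample)
result = summaries_by_id(root)
print(result)

# '*/TrackSummary' reaches the same elements from the root in one step.
grandchildren = root.findall('*/TrackSummary')
print(len(grandchildren))
```

Walking TrackInfo first keeps each summary tied to its ID attribute, which a flat '*/TrackSummary' list does not do.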