Question

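I am trying to use the USPS API to return the status of a package tracking request. I have a method that returns an ElementTree.Element object built from the XML string returned by the USPS API.

This is the XML string that comes back.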

<?xml version="1.0" encoding="UTF-8"?>
  <TrackResponse>
    <TrackInfo ID="EJ958088694US">
      <TrackSummary>The Postal Service could not locate the tracking information for your 
       request. Please verify your tracking number and try again later.</TrackSummary>
    </TrackInfo>
  </TrackResponse>

I parse it into an Element object:

response = xml.etree.ElementTree.fromstring(xml_str)

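Now, I can see in the XML string that the tag 'TrackSummary' exists, and I would expect to be able to access it using ElementTree's find method.

As extra proof, I can iterate over the response object and show that the 'TrackSummary' tag exists.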

for item in response.iter():
    print(item, item.text)

Returns:

<Element 'TrackResponse' at 0x00000000041B4B38> None
<Element 'TrackInfo' at 0x00000000041B4AE8> None
<Element 'TrackSummary' at 0x00000000041B4B88> The Postal Service could not locate the tracking information for your request. Please verify your tracking number and try again later.

So here is the problem.

print(response.find('TrackSummary'))

Returns

None

Am I missing something here? It seems like I should be able to find that child element without a problem.

Answer 1

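The .find() method only searches the element's direct children; it does not search recursively. To search recursively, you need an XPath query. In XPath, a double slash // means a recursive search. Try this (note that .xpath() is provided by lxml elements, not by the standard-library xml.etree.ElementTree):

# returns a list of elements with tag TrackSummary
response.xpath('//TrackSummary')

# returns a list of the text contained in each TrackSummary tag
response.xpath('//TrackSummary/node()')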
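
For the standard-library parser used in the question, the same recursive search can be sketched with the `.//` prefix, which `find` and `findall` accept. This is a minimal sketch, assuming a trimmed copy of the XML from the question (the declaration line is omitted and the summary text is shortened):

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response from the question (assumption: shortened text).
xml_str = """<TrackResponse>
  <TrackInfo ID="EJ958088694US">
    <TrackSummary>The Postal Service could not locate the tracking information for your request.</TrackSummary>
  </TrackInfo>
</TrackResponse>"""

response = ET.fromstring(xml_str)

# A bare tag name only matches direct children of TrackResponse,
# and TrackSummary is a grandchild, so this is None.
direct = response.find('TrackSummary')

# The './/' prefix searches all descendants of the current element.
summary = response.find('.//TrackSummary')
summaries = response.findall('.//TrackSummary')

print(direct)
print(summary.text)
print(len(summaries))
```

Spelling out the full path, `response.find('TrackInfo/TrackSummary')`, should also work when the nesting is known in advance.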

Answer 2

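# Originally imported as xml.etree.cElementTree (cited as 15 to 20 times faster).
# That module was removed in Python 3.9; ElementTree uses the C accelerator automatically.
import xml.etree.ElementTree as ET

response = ET.fromstring(xml_str)  # xml_str is the XML string from the question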

XPath syntax: * selects all child elements. For example, */egg selects all grandchildren named egg.

element = response.findall('*/TrackSummary')  # you will get a list
print(element[0].text)  # quick print; otherwise iterate over the list

>>> The Postal Service could not locate the tracking information for your request. Please verify your tracking number and try again later.
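
Because `*/TrackSummary` depends on the summary sitting exactly two levels below the root, another option is to walk the TrackInfo children explicitly. The following is a rough sketch, assuming a response may hold several TrackInfo elements (the question shows only one); the `summaries` dictionary is a name introduced here for illustration:

```python
import xml.etree.ElementTree as ET

# XML from the question, without the declaration line.
xml_str = """<TrackResponse>
  <TrackInfo ID="EJ958088694US">
    <TrackSummary>The Postal Service could not locate the tracking information for your
       request. Please verify your tracking number and try again later.</TrackSummary>
  </TrackInfo>
</TrackResponse>"""

response = ET.fromstring(xml_str)

# Map each TrackInfo ID attribute to its summary text.
summaries = {}
for info in response.findall('TrackInfo'):
    # findtext() returns the text of the first matching child, or the default.
    text = info.findtext('TrackSummary', default='')
    # Collapse the line break and indentation inside the element text.
    summaries[info.get('ID')] = ' '.join(text.split())

print(summaries)
```

Keying by the ID attribute keeps each summary tied to its tracking number if more than one is returned.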

ElementTree XML API and child elements
