Question

我正在尝试使用python解析下面的xml。我不明白这是哪种类型的xml，因为我从未使用过这种xml。我只是从微软的api响应中得到它。

现在我的问题是如何在我的python代码中解析并获取BinarySecurityToken的值。

我引用此问题Parse XML SOAP response with Python

但是看起来这也有一些xmlns来获取文本。但是在我的xml中，我无法看到任何附近的xmlns值，我可以得到该值。

请告诉我如何使用python从xml以下获取特定字段的值。

<?xml version="1.0" encoding="utf-8" ?>
<S:Envelope xmlns:S="http://www.w3.org/2003/05/soap-envelope" xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd" xmlns:wsa="http://www.w3.org/2005/08/addressing">
  <S:Header>
    <wsa:Action xmlns:S="http://www.w3.org/2003/05/soap-envelope" xmlns:wsa="http://www.w3.org/2005/08/addressing" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd" wsu:Id="Action" S:mustUnderstand="1">http://schemas.xmlsoap.org/ws/2005/02/trust/RSTR/Issue</wsa:Action>
    <wsa:To xmlns:S="http://www.w3.org/2003/05/soap-envelope" xmlns:wsa="http://www.w3.org/2005/08/addressing" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd" wsu:Id="To" S:mustUnderstand="1">http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:To>
    <wsse:Security S:mustUnderstand="1">
      <wsu:Timestamp xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd" wsu:Id="TS">
        <wsu:Created>2017-06-12T10:23:01Z</wsu:Created>
        <wsu:Expires>2017-06-12T10:28:01Z</wsu:Expires>
      </wsu:Timestamp>
    </wsse:Security>
  </S:Header>
  <S:Body>
    <wst:RequestSecurityTokenResponse xmlns:S="http://www.w3.org/2003/05/soap-envelope" xmlns:wst="http://schemas.xmlsoap.org/ws/2005/02/trust" xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd" xmlns:saml="urn:oasis:names:tc:SAML:1.0:assertion" xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy" xmlns:psf="http://schemas.microsoft.com/Passport/SoapServices/SOAPFault">
      <wst:TokenType>urn:passport:compact</wst:TokenType>
      <wsp:AppliesTo xmlns:wsa="http://www.w3.org/2005/08/addressing">
        <wsa:EndpointReference>
          <wsa:Address>https://something.something.something.com</wsa:Address>
        </wsa:EndpointReference>
      </wsp:AppliesTo>
      <wst:Lifetime>
        <wsu:Created>2017-06-12T10:23:01Z</wsu:Created>
        <wsu:Expires>2017-06-13T10:23:01Z</wsu:Expires>
      </wst:Lifetime>
      <wst:RequestedSecurityToken>
        <wsse:BinarySecurityToken Id="Compact0">my token</wsse:BinarySecurityToken>
      </wst:RequestedSecurityToken>
      <wst:RequestedAttachedReference>
        <wsse:SecurityTokenReference>
          <wsse:Reference URI="wwwww=">
          </wsse:Reference>
        </wsse:SecurityTokenReference>
      </wst:RequestedAttachedReference>
      <wst:RequestedUnattachedReference>
        <wsse:SecurityTokenReference>
          <wsse:Reference URI="swsw=">
          </wsse:Reference>
        </wsse:SecurityTokenReference>
      </wst:RequestedUnattachedReference>
    </wst:RequestSecurityTokenResponse>
  </S:Body>
</S:Envelope>

Answer 1

此声明是根元素的开始标记的一部分：

xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"

这意味着具有wsse前缀的元素（例如BinarySecurityToken）位于http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd命名空间中。

解决方案与链接问题的答案基本相同。它只是另一个命名空间：

import xml.etree.ElementTree as ET

tree = ET.parse('soap.xml')    
print tree.find('.//{http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd}BinarySecurityToken').text

以下是另一种方法：

import xml.etree.ElementTree as ET

ns = {"wsse": "http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"}
tree = ET.parse('soap.xml') 
print tree.find('.//wsse:BinarySecurityToken', ns).text

两种情况下的输出都是my token。

请参阅https://docs.python.org/2.7/library/xml.etree.elementtree.html#parsing-xml-with-namespaces。

Answer 2

创建名称空间字典可以帮助我。谢谢@mzjn链接该文章。

在SOAP响应中，我发现必须使用元素的完整路径来提取文本。

例如，我正在使用FEDEX API，我需要找到的一个元素是TrackDetails。我最初的.find()看起来像.find('{http://fedex.com/ws/track/v16}TrackDetails')

我能够简化为以下内容：

ns = {'TrackDetails': 'http://fedex.com/ws/track/v16'}
tree.find('TrackDetails:TrackDetails',ns)

您会两次看到TrackDetails，因为我在dict中为键TrackDetails命名了，但是您可以随意命名。只是帮助我记住了我在项目中的工作，但是:之后的TrackDetails是我需要的SOAP响应中的实际元素。

希望这对某人有帮助！

在Python中解析soap / XML响应

2 个答案: