Question

Consider the following XML (which I have stored in the string variable data):

<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing">
    <s:Header>
        <a:Action s:mustUnderstand="1">foo</a:Action>
    </s:Header>
    <s:Body>
        <ns1:RetrieveStoryML_Response_1 xmlns:ns0="http://www.bar.com" xmlns:ns1="http://www.foo.bar">
            <ns1:StoryMLResponse>
                <ns1:STORYML xmlns="http://www.none.com">
                    <HL space="preserve" xmlns:ns3="http://www.foo.foo.foo">
                        <ID>12345</ID>
                        <TE>This is the text I'd really like to get</TE>
                    </HL>
                </ns1:STORYML>
            </ns1:StoryMLResponse>
        </ns1:RetrieveStoryML_Response_1>
    </s:Body>
</s:Envelope>

I'm trying to strip out everything that is not contained in the HL tag. My expected output is:

<?xml version="1.0" encoding="UTF-8"?>
<HL space="preserve" xmlns:ns3="http://www.foo.foo.foo">
    <ID>12345</ID>
    <TE>This is the text I'd really like to get</TE>
</HL>

So I load the data:

import xml.etree.ElementTree as ET
root = ET.fromstring(data)

root now looks like this:

<Element '{http://www.w3.org/2003/05/soap-envelope}Envelope' at 0x000000000BE6D4A8>

Then I tried the findall method like this, following the XML XPath docs:

root.findall('.//HL')

But I get back an empty list. How can I filter this XML effectively?
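
For reference, a likely cause is that HL inherits the default namespace declared on its parent (xmlns="http://www.none.com"), so ElementTree stores its tag as {http://www.none.com}HL and the unqualified path './/HL' matches nothing. Below is a minimal sketch of a namespace-qualified lookup. The XML is a shortened stand-in for the document above, and the prefix d is an arbitrary name chosen for illustration, not something the document defines.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the SOAP response above (assumption: same namespaces).
data = """<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope">
    <s:Body>
        <ns1:STORYML xmlns:ns1="http://www.foo.bar" xmlns="http://www.none.com">
            <HL space="preserve">
                <ID>12345</ID>
                <TE>This is the text I'd really like to get</TE>
            </HL>
        </ns1:STORYML>
    </s:Body>
</s:Envelope>"""

root = ET.fromstring(data)

# The unqualified name does not match, because HL lives in the default namespace.
unqualified = root.findall('.//HL')

# Map an arbitrary prefix ("d" is hypothetical) to the default namespace URI.
ns = {'d': 'http://www.none.com'}
hl = root.find('.//d:HL', ns)

# Child elements inherit the same namespace, so they need the prefix too.
text = hl.find('d:TE', ns).text
print(text)

# Serialize only the HL subtree. Registering the URI with an empty prefix
# avoids auto-generated prefixes such as ns0 in the output.
ET.register_namespace('', 'http://www.none.com')
print(ET.tostring(hl, encoding='unicode'))
```

Note that the serialized HL element will still carry an xmlns="http://www.none.com" declaration, so it will not be byte-for-byte identical to the expected output shown earlier. Removing the namespace entirely would require rewriting each element's tag.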

"Filter" or select everything under a given tag in XML

0 answers: