Extracting a value from XML using XPath

Date: 2018-08-10 17:06:19

Tags: xml xpath

I get the following XML response from an API call:

<?xml version='1.0' encoding='ISO-8859-1'?>
<PPRESULTS s="DEV12" lst="8/10/2018 10:27:06 AM">
    <Results success="1" api="0" rolename="user" toolbarcode="standard" version="7.0">
        <usercontext status=" " managedbilling="False" managedmf="False" masteroffice="False" msooffice="False">xxxxxLGLQs+mbLDJ3X/zNwxdeehwhEathbBoHMVgLGnbNt7X8NcI8Y7KXwO+oOrRlnWscVxoUyo/E6WUPMkPWP8aSOW9ofwFL3b6mtFDR/GLLoJIFbduGD8civ9xF/KNyd8ceXmBc6/wi3wtyvrExjkEqbHwNL6aW60FrioUZo9eW4Z2BVkT3Xaqk4He+fx1ibp8XgEGklWKa7FoA7JEvtqcgLw==</usercontext>
    </Results>
</PPRESULTS>

I then want to use XPath to extract this text:

xxxxxLGLQs+mbLDJ3X/zNwxdeehwhEathbBoHMVgLGnbNt7X8NcI8Y7KXwO+oOrRlnWscVxoUyo/E6WUPMkPWP8aSOW9ofwFL3b6mtFDR/GLLoJIFbduGD8civ9xF/KNyd8ceXmBc6/wi3wtyvrExjkEqbHwNL6aW60FrioUZo9eW4Z2BVkT3Xaqk4He+fx1ibp8XgEGklWKa7FoA7JEvtqcgLw==

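I have this XPath, PPRESULTS/Results[1]/usercontext[1], but it returns the entire <usercontext> element, attributes included. How can I extract only the text? Keep in mind that the text changes on every call; it is a token.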

1 answer:

Answer 0 (score: 0)

Using JavaScript in a browser, you could do:

document.evaluate('//usercontext//text()', document, null, XPathResult.ANY_TYPE, null).iterateNext().textContent

(but see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Introduction_to_using_XPath_in_JavaScript for more details).

If you are using Python, you could do something like this:

>>> from lxml import etree
>>> doc = etree.parse('foo.xml')  # parse() accepts a filename directly
>>> print(doc.xpath('//usercontext//text()'))
['xxxxxLGLQs+mbLDJ3X/zNwxdeehwhEathbBoHMVgLGnbNt7X8NcI8Y7KXwO+oOrRlnWscVxoUyo/E6WUPMkPWP8aSOW9ofwFL3b6mtFDR/GLLoJIFbduGD8civ9xF/KNyd8ceXmBc6/wi3wtyvrExjkEqbHwNL6aW60FrioUZo9eW4Z2BVkT3Xaqk4He+fx1ibp8XgEGklWKa7FoA7JEvtqcgLw==']

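In either case, you need the text() node test at the end of the path to select the element's text content rather than the element itself.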
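
If lxml is not available, the same value can be read with Python's standard library. This is only a sketch under a few assumptions: xml.etree.ElementTree supports a limited XPath subset that does not include text(), so the element is located by path and its .text attribute is read instead. The RESPONSE bytes are a shortened stand-in for the API reply, the token in it is a placeholder, and extract_token is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the API response; the token value is a placeholder.
RESPONSE = (
    b"<?xml version='1.0' encoding='ISO-8859-1'?>"
    b'<PPRESULTS s="DEV12" lst="8/10/2018 10:27:06 AM">'
    b'<Results success="1" api="0" rolename="user">'
    b'<usercontext status=" " managedbilling="False">xxxxxLGLQs+mbLDJ3X/zNw==</usercontext>'
    b"</Results>"
    b"</PPRESULTS>"
)


def extract_token(xml_bytes):
    """Return the text inside the first <usercontext> element (hypothetical helper)."""
    root = ET.fromstring(xml_bytes)  # root is the <PPRESULTS> element
    # The path is relative to the root, so it starts at <Results>.
    node = root.find("Results/usercontext")
    if node is None or node.text is None:
        raise ValueError("no <usercontext> text found in the response")
    # .text holds only the character data, not the attributes.
    return node.text.strip()


token = extract_token(RESPONSE)
print(token)
```

Because the lookup depends only on the element path, it keeps working when the token value changes between calls.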