Finding all ancestors of an XML node with lxml in Python

Time: 2015-02-11 10:36:05

Tags: python xml lxml

I am trying to find all the ancestors of a node.

My XML:

xmldata="""
<OrganizationTreeInfo xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/YSM.PMS.Web.Service.DataTransfer.Models">
<Name>Parent</Name>
<OrganizationId>4345</OrganizationId>
<Children>
    <OrganizationTreeInfo>
    <Name>A</Name>
    <OrganizationId>123</OrganizationId>
        <Children>
            <OrganizationTreeInfo>
                <Name>B</Name>
                <OrganizationId>54</OrganizationId>
                <Children/>
            </OrganizationTreeInfo>
        </Children>
    </OrganizationTreeInfo>
    <OrganizationTreeInfo>
    <Name>C</Name>
    <OrganizationId>34</OrganizationId>
        <Children>
            <OrganizationTreeInfo>
                <Name>D</Name>
                <OrganizationId>32323</OrganizationId>
                <Children>
                    <OrganizationTreeInfo>
                        <Name>E</Name>
                        <OrganizationId>3234</OrganizationId>
                        <Children/>
                    </OrganizationTreeInfo>
                </Children>
            </OrganizationTreeInfo>
        </Children>
    </OrganizationTreeInfo>
</Children>

</OrganizationTreeInfo>
"""

For example, if I input the OrganizationId value 3234, the output should be:

{'Parent': 4345, 'C': 34, 'D': 32323, 'E': 3234}

Here is my attempt:

root = ET.fromstring(xmldata)
for target in root.xpath('.//OrganizationTreeInfo/OrganizationId[text()="3234"]'):
    d = {
        dept.find('Name').text: int(dept.find('OrganizationId').text)
        for dept in target.xpath('ancestor-or-self::OrganizationTreeInfo')
    }
    print(d)

But it does not produce any output, and I can't figure out what is wrong with it.

1 answer:

Answer 0: (score: 1)

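You don't get the expected result because of the default namespace declared on the root element, xmlns="http://schemas.datacontract.org/2004/07/YSM.PMS.Web.Service.DataTransfer.Models". Every element in the document belongs to that namespace, so an unprefixed name such as OrganizationTreeInfo in an XPath expression matches nothing.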
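The effect is easy to reproduce on a tiny document. The following is a minimal sketch, not the asker's data: the element name Org, the URI http://example.com/ns, and the prefix ns are all made up for illustration.

```python
import lxml.etree as ET

# Stand-in document with a default namespace (hypothetical URI and tag names).
doc = ET.fromstring(
    '<Org xmlns="http://example.com/ns"><Name>Parent</Name></Org>'
)

# lxml keeps the namespace as part of every tag, in {uri}local form.
print(doc.tag)

# An unprefixed XPath name only matches elements that are in no namespace.
unprefixed = doc.xpath('.//Name')

# Binding a prefix of our choosing to the URI makes the same query match.
prefixed = doc.xpath('.//ns:Name', namespaces={'ns': 'http://example.com/ns'})

print(len(unprefixed), len(prefixed))
```

The prefix used in the XPath expression does not have to appear in the document; only the URI it is bound to matters.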

One way is to bind the namespace to a prefix and use that prefix in the XPath expressions.

Code:

import pprint

import lxml.etree as ET

root = ET.fromstring(xmldata)

# Bind a prefix of our choosing ("ns") to the document's default namespace URI.
namespaces1 = {'ns': 'http://schemas.datacontract.org/2004/07/YSM.PMS.Web.Service.DataTransfer.Models'}

result = {}
count = 1
# Find every OrganizationId element whose text is "3234".
for target in root.xpath('.//ns:OrganizationTreeInfo/ns:OrganizationId[text()="3234"]',
                         namespaces=namespaces1):
    result[count] = {}
    # Walk up through every enclosing OrganizationTreeInfo, including the root.
    for dept in target.xpath('ancestor-or-self::ns:OrganizationTreeInfo', namespaces=namespaces1):
        name = dept.find('ns:Name', namespaces=namespaces1).text
        org_id = int(dept.find('ns:OrganizationId', namespaces=namespaces1).text)
        result[count][name] = org_id
    count += 1

pprint.pprint(result)

Output:

:~/workspace/vtestproject/study$ python test1.py
{1: {'C': 34, 'D': 32323, 'E': 3234, 'Parent': 4345}}
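If the flat dictionary from the question is what is wanted, the same idea can be wrapped in a small helper. This is only a sketch: the function name ancestors_by_id, the prefix ns, and the trimmed-down XML (just the C > D > E branch) are assumptions made for illustration. It passes the id through an XPath variable ($org_id), which lxml fills in from a keyword argument.

```python
import lxml.etree as ET

NS = {'ns': 'http://schemas.datacontract.org/2004/07/YSM.PMS.Web.Service.DataTransfer.Models'}

# Trimmed-down copy of the question's document (only the C > D > E branch).
XML = """
<OrganizationTreeInfo xmlns="http://schemas.datacontract.org/2004/07/YSM.PMS.Web.Service.DataTransfer.Models">
  <Name>Parent</Name><OrganizationId>4345</OrganizationId>
  <Children>
    <OrganizationTreeInfo>
      <Name>C</Name><OrganizationId>34</OrganizationId>
      <Children>
        <OrganizationTreeInfo>
          <Name>D</Name><OrganizationId>32323</OrganizationId>
          <Children>
            <OrganizationTreeInfo>
              <Name>E</Name><OrganizationId>3234</OrganizationId>
              <Children/>
            </OrganizationTreeInfo>
          </Children>
        </OrganizationTreeInfo>
      </Children>
    </OrganizationTreeInfo>
  </Children>
</OrganizationTreeInfo>
"""


def ancestors_by_id(root, org_id):
    """Return {Name: OrganizationId} for the matching node and all its ancestors."""
    # $org_id is an XPath variable supplied by the org_id keyword argument.
    hits = root.xpath('//ns:OrganizationId[text()=$org_id]',
                      namespaces=NS, org_id=str(org_id))
    if not hits:
        return {}
    # The ancestor axis is returned in document order, so the root comes first.
    return {
        dept.findtext('ns:Name', namespaces=NS):
            int(dept.findtext('ns:OrganizationId', namespaces=NS))
        for dept in hits[0].xpath('ancestor::ns:OrganizationTreeInfo', namespaces=NS)
    }


root = ET.fromstring(XML)
chain = ancestors_by_id(root, 3234)
print(chain)
```

Only the first match is used here; if several nodes could share an id, looping over hits as in the answer's code would be the safer choice.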

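Alternatively, replace the xmlns= string with some other temporary string before parsing, so that the parser treats it as an ordinary attribute instead of a default namespace declaration. The unprefixed XPath expressions from the question then work as written.

Code: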

import pprint

import lxml.etree as ET

# Rename the default-namespace declaration so the parser sees a plain attribute.
# The prefixed declaration xmlns:i="..." is untouched because it does not contain "xmlns=".
new_xmldata = xmldata.replace("xmlns=", "xmlnamespace=")

root = ET.fromstring(new_xmldata)

result = {}
count = 1
for target in root.xpath('.//OrganizationTreeInfo/OrganizationId[text()="3234"]'):
    result[count] = {}
    for dept in target.xpath('ancestor-or-self::OrganizationTreeInfo'):
        result[count][dept.find('Name').text] = int(dept.find('OrganizationId').text)
    count += 1

pprint.pprint(result)

Output:

:~/workspace/vtestproject/study$ python test1.py
{1: {'C': 34, 'D': 32323, 'E': 3234, 'Parent': 4345}}
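A third possibility, not part of the original answer, is to leave the XML text alone and make the XPath ignore namespaces by testing local-name(). The sketch below assumes a shortened stand-in document that reuses the question's element names; the URI http://example.com/ns is made up.

```python
import lxml.etree as ET

# Shortened stand-in document; the namespace URI is hypothetical.
xml = (
    '<OrganizationTreeInfo xmlns="http://example.com/ns">'
    '<Name>Parent</Name><OrganizationId>4345</OrganizationId>'
    '<Children><OrganizationTreeInfo>'
    '<Name>A</Name><OrganizationId>123</OrganizationId><Children/>'
    '</OrganizationTreeInfo></Children>'
    '</OrganizationTreeInfo>'
)
root = ET.fromstring(xml)

# local-name() compares only the part of the tag after the namespace.
target = root.xpath('//*[local-name()="OrganizationId"][text()="123"]')[0]

chain = {
    # string(...) yields the text of the first matching child element.
    dept.xpath('string(*[local-name()="Name"])'):
        int(dept.xpath('string(*[local-name()="OrganizationId"])'))
    for dept in target.xpath('ancestor::*[local-name()="OrganizationTreeInfo"]')
}
print(chain)
```

This avoids both the prefix mapping and the string replacement, at the cost of more verbose expressions and of matching same-named elements from any namespace.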