Python: recursively sorting XML elements by tag and attributes

Time: 2017-11-03 13:07:21

Tags: python xml sorting

I'm new to Python and I'm trying to sort an XML document according to a few rules. My example:

<?xml version="1.0"?>
<data>
    <e2 id="3" name="name3">
        <e12 num="num12" desc="desc12"/>
        <e12 num="num12" desc="desc11"/>
        <e11 num="num1" desc="desc1"/>
    </e2>
    <e2 id="2" name="name2">
        <e11 num="num1" desc="desc1"/>
    </e2>
    <e1 id="1" name="name1">
        <e12 num="num12" desc="desc12"/>
        <e11 num="num4" desc="desc4"/>
    </e1>
</data>

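My rules are:
1) Within each element, sort the attributes by name.
2) Sort the elements:
* by tag name (if there are no attributes)
* if the tag names are the same, by their attributes, in order

In my case, e1 has to come before e2. Because there are two e2 elements, they have to be ordered by their attributes: one has id = 2 and the other has id = 3, so the id value decides the order.
The desired output XML looks like this: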

<?xml version="1.0"?>
<data>
    <e1 id="1" name="name1">
        <e11 desc="desc4" num="num4"/>
        <e12 desc="desc12" num="num12"/>
    </e1>
    <e2 id="2" name="name2">
        <e11 desc="desc1" num="num1"/>
    </e2>
    <e2 id="3" name="name3">
        <e11 desc="desc1" num="num1"/>
        <e12 desc="desc11" num="num12"/>
        <e12 desc="desc12" num="num12"/>
    </e2>
</data>

Any suggestions or ideas on how to do this?
Thanks.

2 answers:

Answer 0 (score: 2)

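You can sort the XML with ElementTree. In my example, I first sort the top-level elements by tag name and then by the value of the 'name' attribute, and then sort the child elements by tag name and the value of the 'desc' attribute:
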
import xml.etree.ElementTree as ET

# xmlstr is assumed to hold the XML document from the question as a string
tree = ET.ElementTree(ET.fromstring(xmlstr))
root = tree.getroot()

# sort the first layer
root[:] = sorted(root, key=lambda child: (child.tag,child.get('name')))

# sort the second layer
for c in root:
    c[:] = sorted(c, key=lambda child: (child.tag,child.get('desc')))

xmlstr = ET.tostring(root, encoding="utf-8", method="xml")
print(xmlstr.decode("utf-8"))

This prints:

<data>
<e1 id="1" name="name1">
    <e11 desc="desc4" num="num4" />
    <e12 desc="desc12" num="num12" />
</e1>
<e2 id="2" name="name2">
    <e11 desc="desc1" num="num1" />
</e2>
<e2 id="3" name="name3">
    <e11 desc="desc1" num="num1" />
    <e12 desc="desc11" num="num12" />
    <e12 desc="desc12" num="num12" />
</e2>
</data>
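
The code above handles exactly two levels and does not touch attribute order itself. Up to Python 3.7, ElementTree wrote attributes in sorted order when serializing, which is why they appear sorted in the output; from Python 3.8 on, it keeps the order in which they were set. As a minimal sketch of how the question's rules could be applied at any depth, the helper below (the name `sort_element` is made up for illustration) rebuilds each attribute dict in name order and sorts children by tag and then by their sorted attribute pairs. It assumes attribute values are compared as plain strings, so an id of "10" would sort before "2".

```python
import xml.etree.ElementTree as ET


def sort_element(elem):
    """Sort attributes by name and children by (tag, attributes), recursively."""
    # Rebuild the attribute dict in name order. On Python 3.8+ the
    # serializer keeps this insertion order when writing the element.
    items = sorted(elem.attrib.items())
    elem.attrib.clear()
    elem.attrib.update(items)

    # Sort the descendants first, then reorder the direct children.
    for child in elem:
        sort_element(child)
    elem[:] = sorted(elem, key=lambda c: (c.tag, sorted(c.attrib.items())))


# Sample data taken from the question (whitespace trimmed).
xml_text = """<data>
<e2 id="3" name="name3">
<e12 num="num12" desc="desc12"/>
<e12 num="num12" desc="desc11"/>
<e11 num="num1" desc="desc1"/>
</e2>
<e2 id="2" name="name2">
<e11 num="num1" desc="desc1"/>
</e2>
<e1 id="1" name="name1">
<e12 num="num12" desc="desc12"/>
<e11 num="num4" desc="desc4"/>
</e1>
</data>"""

root = ET.fromstring(xml_text)
sort_element(root)
print(ET.tostring(root, encoding="unicode"))
```

Sorting on the full list of attribute pairs means no attribute name such as 'name' or 'desc' has to be hard-coded, and elements that lack an attribute do not raise an error.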

Answer 1 (score: 1)

A solution using xml.etree.ElementTree objects:

import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
data = tree.getroot()
els = data.findall("*[@id]")   # all e<number> elements having `id` attribute
new_els = sorted(els, key=lambda el: (el.tag, el.attrib['id']))
for el in new_els:
    el[:] = sorted(el, key=lambda e: (e.tag, e.attrib['desc']))
data[:] = new_els

tree.write('result.xml', xml_declaration=True, encoding='utf-8')

The final contents of result.xml:

<?xml version='1.0' encoding='utf-8'?>
<data>
    <e1 id="1" name="name1">
        <e11 desc="desc4" num="num4" />
    <e12 desc="desc12" num="num12" />
        </e1>
<e2 id="2" name="name2">
        <e11 desc="desc1" num="num1" />
    </e2>
    <e2 id="3" name="name3">
        <e11 desc="desc1" num="num1" />
    <e12 desc="desc11" num="num12" />
        <e12 desc="desc12" num="num12" />
        </e2>
    </data>
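
The uneven indentation in this result comes from the whitespace stored in each element's text and tail, which moves with the element when it is reordered. If a tidy layout matters, one option on Python 3.9 or later is `ET.indent`, which rewrites that whitespace in place. The sketch below uses a shortened sample document and an in-memory string rather than the files from the answer; both are assumptions made to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Shortened sample document (not the full input from the question).
xml_text = """<data>
    <e2 id="2"><e11 desc="desc1" num="num1" /></e2>
    <e1 id="1"><e11 desc="desc4" num="num4" /></e1>
</data>"""

root = ET.fromstring(xml_text)

# Reorder the children; their old text/tail whitespace travels with them.
root[:] = sorted(root, key=lambda el: (el.tag, el.get("id", "")))

# Normalize the indentation after sorting (requires Python 3.9+).
ET.indent(root, space="    ")

result = ET.tostring(root, encoding="unicode")
print(result)
```

The same call accepts an `ElementTree` object, so it could be placed just before `tree.write(...)` in the answer's code.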