I'm new to Python and I'm trying to sort an XML document according to some rules. My example:
<?xml version="1.0"?>
<data>
<e2 id="3" name="name3">
<e12 num="num12" desc="desc12"/>
<e12 num="num12" desc="desc11"/>
<e11 num="num1" desc="desc1"/>
</e2>
<e2 id="2" name="name2">
<e11 num="num1" desc="desc1"/>
</e2>
<e1 id="1" name="name1">
<e12 num="num12" desc="desc12"/>
<e11 num="num4" desc="desc4"/>
</e1>
</data>
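My rules are:
1) Within each element, sort the attributes by name.
2) Sort the elements:
* by tag name (if there are no attributes)
* by their attributes, in order, if the tag names are the same
In my case, e1 has to come before e2.
Because I have two e2 elements, they need to be ordered by their attributes; for example, one has id=2 and the other has id=3, so the order should be determined by the id value.
The desired output XML looks like this: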
<?xml version="1.0"?>
<data>
<e1 id="1" name="name1">
<e11 desc="desc4" num="num4"/>
<e12 desc="desc12" num="num12"/>
</e1>
<e2 id="2" name="name2">
<e11 desc="desc1" num="num1"/>
</e2>
<e2 id="3" name="name3">
<e11 desc="desc1" num="num1"/>
<e12 desc="desc11" num="num12"/>
<e12 desc="desc12" num="num12"/>
</e2>
</data>
Any suggestions or ideas on how to do this?
Thanks.
Answer 0 (score: 2)
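You can sort the XML with ElementTree. In my example, I first sort the top-level elements by tag name and then by the value of the attribute 'name', and then sort the child elements by tag name and the value of the attribute 'desc':
import xml.etree.ElementTree as ET
# xmlstr is assumed to hold the XML document from the question as a string
tree = ET.ElementTree(ET.fromstring(xmlstr))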
root = tree.getroot()
# sort the first layer
root[:] = sorted(root, key=lambda child: (child.tag,child.get('name')))
# sort the second layer
for c in root:
    c[:] = sorted(c, key=lambda child: (child.tag,child.get('desc')))
xmlstr = ET.tostring(root, encoding="utf-8", method="xml")
print(xmlstr.decode("utf-8"))
This prints:
<data>
<e1 id="1" name="name1">
<e11 desc="desc4" num="num4" />
<e12 desc="desc12" num="num12" />
</e1>
<e2 id="2" name="name2">
<e11 desc="desc1" num="num1" />
</e2>
<e2 id="3" name="name3">
<e11 desc="desc1" num="num1" />
<e12 desc="desc11" num="num12" />
<e12 desc="desc12" num="num12" />
</e2>
</data>
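One caveat about the output above: the attributes appear sorted by name (rule 1) only because older Python versions wrote attributes in sorted order when serializing. Since Python 3.8, ElementTree preserves the order in which the attributes were specified, so on newer versions they come out in document order unless they are reordered explicitly. A minimal sketch of doing that; the helper name `sort_attributes` is hypothetical and not part of ElementTree, and the sample input is a shortened version of the question's XML:

```python
import xml.etree.ElementTree as ET

def sort_attributes(root):
    """Reorder every element's attributes alphabetically by name, in place."""
    for el in root.iter():
        items = sorted(el.attrib.items())
        # Rebuild the attribute dict so its insertion order is the sorted order.
        el.attrib.clear()
        el.attrib.update(items)

# Shortened sample based on the question's data.
root = ET.fromstring(
    '<data><e2 id="3" name="name3"><e12 num="num12" desc="desc12"/></e2></data>'
)
sort_attributes(root)
print(ET.tostring(root, encoding="unicode"))
```

Calling `sort_attributes(root)` before `ET.tostring` should give the same attribute order on old and new Python versions.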
Answer 1 (score: 1)
A solution using xml.etree.ElementTree objects:
import xml.etree.ElementTree as ET
tree = ET.parse('input.xml')
data = tree.getroot()
els = data.findall("*[@id]") # all e<number> elements having `id` attribute
new_els = sorted(els, key=lambda el: (el.tag, el.attrib['id']))
for el in new_els:
    el[:] = sorted(el, key=lambda e: (e.tag, e.attrib['desc']))
data[:] = new_els
tree.write('result.xml', xml_declaration=True, encoding='utf-8')
The final result.xml content:
<?xml version='1.0' encoding='utf-8'?>
<data>
<e1 id="1" name="name1">
<e11 desc="desc4" num="num4" />
<e12 desc="desc12" num="num12" />
</e1>
<e2 id="2" name="name2">
<e11 desc="desc1" num="num1" />
</e2>
<e2 id="3" name="name3">
<e11 desc="desc1" num="num1" />
<e12 desc="desc11" num="num12" />
<e12 desc="desc12" num="num12" />
</e2>
</data>
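Both answers sort exactly two levels and compare attribute values as strings, so an id of "10" would sort before "2". The following is a sketch of a more general variant, not the method from either answer. It assumes that elements with the same tag should be ordered by their attribute values taken in attribute-name order, and that digit-only values should compare as numbers. The names `sort_key` and `sort_tree` are hypothetical, and the sample input is made up for illustration:

```python
import xml.etree.ElementTree as ET

def sort_key(el):
    """Order by tag, then by attribute values taken in attribute-name order."""
    attrs = []
    for name, value in sorted(el.attrib.items()):
        if value.isdecimal():
            # Digit-only values compare numerically and sort before text values.
            attrs.append((name, (0, int(value), "")))
        else:
            attrs.append((name, (1, 0, value)))
    return (el.tag, attrs)

def sort_tree(el):
    """Recursively sort the children of el in place, at any depth."""
    for child in el:
        sort_tree(child)
    el[:] = sorted(el, key=sort_key)

# Illustrative input, not the question's data.
root = ET.fromstring(
    '<data>'
    '<e2 id="10" name="b"><e12 num="n" desc="d2"/><e11 num="n" desc="d1"/></e2>'
    '<e2 id="2" name="a"/>'
    '<e1 id="1" name="c"/>'
    '</data>'
)
sort_tree(root)
print([(el.tag, el.get("id")) for el in root])
```

The key wraps each value in a tuple with a leading type marker so that numbers and strings are never compared directly, which would raise a TypeError in Python 3.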