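I know this question has been asked before, but I am struggling to get it working with my example and would really appreciate some help. What I am trying to achieve seems fairly simple: I have 2 files. One looks like the file below; the second is almost identical, except that it only has LAYER and then TEST NAME, i.e. no MASTER.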
<MASTER>
    <LAYER NAME="LAYER B">
        <TEST NAME="Soup1">
            <TITLE>Title 2</TITLE>
            <SCRIPTFILE>PAth 2</SCRIPTFILE>
            <ASSET_FILE PATH="Path 22" />
            <ARGS>
                <ARG ID="arg_21">some_Arg11</ARG>
                <ARG ID="arg_22">some_Arg12</ARG>
            </ARGS>
            <TIMEOUT OSTYPE="111">1200</TIMEOUT>
        </TEST>
        <TEST NAME="Bread2">
            <TITLE>Title 1</TITLE>
            <SCRIPTFILE>PAth 1</SCRIPTFILE>
            <ASSET_FILE PATH="Path 11" />
            <ARGS>
                <ARG ID="arg_11">some_Arg12</ARG>
                <ARG ID="arg_12">some_Arg22</ARG>
            </ARGS>
            <TIMEOUT OSTYPE="2222">1000</TIMEOUT>
        </TEST>
    </LAYER>
    <LAYER NAME="LAYER A">
        <TEST NAME="Soup2">
            <TITLE>Title 2</TITLE>
            <SCRIPTFILE>PAth 2</SCRIPTFILE>
            <ASSET_FILE PATH="Path 22" />
            <ARGS>
                <ARG ID="arg_21">some_Arg11</ARG>
                <ARG ID="arg_22">some_Arg12</ARG>
            </ARGS>
            <TIMEOUT OSTYPE="111">1200</TIMEOUT>
        </TEST>
        <TEST NAME="Bread2">
            <TITLE>Title 1</TITLE>
            <SCRIPTFILE>PAth 1</SCRIPTFILE>
            <ASSET_FILE PATH="Path 11" />
            <ARGS>
                <ARG ID="arg_11">some_Arg12</ARG>
                <ARG ID="arg_12">some_Arg22</ARG>
            </ARGS>
            <TIMEOUT OSTYPE="2222">1000</TIMEOUT>
        </TEST>
    </LAYER>
</MASTER>
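All I want to do is sort these files by NAME, respecting the individual layers.
In the scenario above, LAYER A should come before LAYER B, and within each layer the tests should be ordered by NAME, so Bread before Soup. In my second case, I don't have these sub-layers.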
<LAYER>
    <TEST NAME="Soup1">
        <TITLE>Title 2</TITLE>
        <SCRIPTFILE>PAth 2</SCRIPTFILE>
        <ASSET_FILE PATH="Path 22" />
        <ARGS>
            <ARG ID="arg_21">some_Arg11</ARG>
            <ARG ID="arg_22">some_Arg12</ARG>
        </ARGS>
        <TIMEOUT OSTYPE="111">1200</TIMEOUT>
    </TEST>
    <TEST NAME="Bread2">
        <TITLE>Title 1</TITLE>
        <SCRIPTFILE>PAth 1</SCRIPTFILE>
        <ASSET_FILE PATH="Path 11" />
        <ARGS>
            <ARG ID="arg_11">some_Arg12</ARG>
            <ARG ID="arg_12">some_Arg22</ARG>
        </ARGS>
        <TIMEOUT OSTYPE="2222">1000</TIMEOUT>
    </TEST>
</LAYER>
I want these sorted by TEST NAME.
Thanks in advance for your help; it would be much appreciated.
Answer 0 (score: 13)
You can do this with ElementTree:
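import xml.etree.ElementTree as ET

def sortchildrenby(parent, attr):
    # Reorder the children in place; the '' default keeps children without attr sortable on Python 3
    parent[:] = sorted(parent, key=lambda child: child.get(attr, ''))

tree = ET.parse('input.xml')
root = tree.getroot()

# Sort the top-level children (the LAYER elements under MASTER)
sortchildrenby(root, 'NAME')

# Sort the children of each top-level child (the TEST elements in each LAYER)
for child in root:
    sortchildrenby(child, 'NAME')

tree.write('output.xml')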
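To try the same slice-assignment idea without touching any files, here is a minimal sketch that runs against an in-memory string. The trimmed-down `SAMPLE` document and the helper name `sort_by_name` are made up for illustration and are not part of the answer above. The sketch assumes you only want to reorder a group of siblings when every one of them carries the attribute, because `child.get(attr)` returns `None` for a missing attribute and Python 3 cannot order `None` values.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down version of the file from the question
SAMPLE = """<MASTER>
  <LAYER NAME="LAYER B">
    <TEST NAME="Soup1"/>
    <TEST NAME="Bread2"/>
  </LAYER>
  <LAYER NAME="LAYER A">
    <TEST NAME="Soup2"/>
    <TEST NAME="Bread2"/>
  </LAYER>
</MASTER>"""


def sort_by_name(root, attr="NAME"):
    """Sort each element's children by `attr`; skip groups where any child lacks it."""
    # Take a snapshot of the elements first so reordering does not disturb the walk
    for parent in list(root.iter()):
        if len(parent) and all(child.get(attr) is not None for child in parent):
            parent[:] = sorted(parent, key=lambda child: child.get(attr))


root = ET.fromstring(SAMPLE)
sort_by_name(root)

layer_names = [layer.get("NAME") for layer in root]
test_names = [[test.get("NAME") for test in layer] for layer in root]
print(layer_names)  # ['LAYER A', 'LAYER B']
print(test_names)   # [['Bread2', 'Soup2'], ['Bread2', 'Soup1']]
```

Because `root.iter()` also yields the root itself, the same helper should work on the second file, where the root is a LAYER that directly contains the TEST elements.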
Answer 1 (score: 0)
If you want to sort recursively, handle comments, and sort on all attributes:
#!/usr/bin/env python
# encoding: utf-8
from __future__ import print_function
import logging
from lxml import etree


def get_node_key(node, attr=None):
    """Return the sorting key of an xml node
    using tag and attributes
    """
    if attr is None:
        return '%s' % node.tag + ':'.join([node.get(attr)
                                           for attr in sorted(node.attrib)])
    if attr in node.attrib:
        return '%s:%s' % (node.tag, node.get(attr))
    return '%s' % node.tag


def sort_children(node, attr=None):
    """ Sort children along tag and given attribute.
    if attr is None, sort along all attributes"""
    if not isinstance(node.tag, str):  # PYTHON 2: use basestring instead
        # not a TAG, it is comment or DATA
        # no need to sort
        return
    # sort child along attr
    node[:] = sorted(node, key=lambda child: get_node_key(child, attr))
    # and recurse
    for child in node:
        sort_children(child, attr)


def sort(unsorted_file, sorted_file, attr=None):
    """Sort unsorted xml file and save to sorted_file"""
    tree = etree.parse(unsorted_file)
    root = tree.getroot()
    sort_children(root, attr)
    sorted_unicode = etree.tostring(root,
                                    pretty_print=True,
                                    encoding='unicode')
    with open(sorted_file, 'w') as output_fp:
        output_fp.write('%s' % sorted_unicode)
        # log the output file name, not the whole serialized document
        logging.info('written sorted file %s', sorted_file)
Note: I am using lxml.etree (http://lxml.de/tutorial.html)
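If lxml is not available, the recursive idea can probably be approximated with the standard library alone. The sketch below is a swapped-in variant, not the code from this answer: `node_key`, `sort_recursively` and the `SAMPLE` string are hypothetical names, and it assumes Python 3.8 or later, where `TreeBuilder(insert_comments=True)` keeps comments in the parsed tree. As with lxml, a comment node has a non-string `tag`, so the same `isinstance(node.tag, str)` check tells comments apart from elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape of the second file, with a comment added
SAMPLE = """<LAYER>
  <!-- nightly tests -->
  <TEST NAME="Soup1"><ARGS><ARG ID="arg_22"/><ARG ID="arg_21"/></ARGS></TEST>
  <TEST NAME="Bread2"/>
</LAYER>"""


def node_key(node, attr=None):
    """Build a sort key from the tag plus one attribute (or all attributes)."""
    if not isinstance(node.tag, str):
        return ""  # comments get an empty key, so they sort before elements
    if attr is None:
        values = [node.get(name) for name in sorted(node.attrib)]
    else:
        values = [node.get(attr, "")]
    return ":".join([node.tag] + values)


def sort_recursively(node, attr=None):
    """Sort the children of `node`, then descend into each child."""
    if not isinstance(node.tag, str):
        return  # a comment has no children to sort
    node[:] = sorted(node, key=lambda child: node_key(child, attr))
    for child in node:
        sort_recursively(child, attr)


# insert_comments=True keeps comments in the tree (Python 3.8+)
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(SAMPLE, parser=parser)
sort_recursively(root)

test_names = [c.get("NAME") for c in root if isinstance(c.tag, str)]
arg_ids = [arg.get("ID") for arg in root.iter("ARG")]
print(test_names)  # ['Bread2', 'Soup1']
print(arg_ids)     # ['arg_21', 'arg_22']
```

One thing to keep in mind with this design: comments are moved to the front of their parent, so a comment that was written just above a particular element will no longer sit next to it after sorting.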