I have two XML files, as follows - XML1
<root>
<info>
<order>2</order>
<infoID>Information</infoID>
<child>
<info>
<order>1</order>
<infoID>AAAAA</infoID>
</info>
<info>
<order>2</order>
<infoID>BBBBB</infoID>
</info>
<info>
<order>3</order>
<infoID>CCC</infoID>
</info>
<info>
<order>4</order>
<infoID>DD</infoID>
</info>
<info>
<order>5</order>
<infoID>EEEEE</infoID>
</info>
</child>
</info>
</root>
XML2
<root>
<Name>XYZ</Name>
<ID>1234</ID>
<Data>
<info>
<desc>Data1</desc>
<displayName>dID</displayName>
<infoID>AAAAA</infoID>
<type>String</type>
</info>
<info>
<desc>Data2</desc>
<displayName>sID</displayName>
<infoID>BBBBB</infoID>
<type>String</type>
</info>
</Data>
</root>
I want to combine them where the infoID matches. For example, when the infoID is AAAAA, all of the related lines should end up under the one child node.
<root>
<Name>XYZ</Name>
<ID>1234</ID>
<Data>
<info>
<order>2</order>
<infoID>Information</infoID>
<child>
<info>
<order>1</order>
<infoID>AAAAA</infoID>
<desc>Data1</desc>
<displayName>dID</displayName>
<type>String</type>
</info>
<info>
<order>2</order>
<infoID>BBBBB</infoID>
<desc>Data2</desc>
<displayName>sID</displayName>
<type>String</type>
</info>
<info>
<order>3</order>
<infoID>CCC</infoID>
</info>
<info>
<order>4</order>
<infoID>DD</infoID>
</info>
<info>
<order>5</order>
<infoID>EEEEE</infoID>
</info>
</child>
</info>
</Data>
</root>
I tried the code below, but it just concatenates the files
from xml.etree import ElementTree as et


class XMLCombiner(object):
    def __init__(self, filenames):
        assert len(filenames) > 0, 'No filenames!'
        # Parse every file up front and keep only the root elements.
        self.roots = [et.parse(f).getroot() for f in filenames]

    def combine(self):
        # Fold every later root into the first one.
        for r in self.roots[1:]:
            self.combine_element(self.roots[0], r)
        return et.tostring(self.roots[0])

    def combine_element(self, one, other):
        one.attrib.update(other.attrib)
        # Children of `one` keyed by tag; repeated tags keep only the last one.
        mapping = {el.tag: el for el in one}
        for el in other:
            if len(el) == 0:
                # Leaf element: overwrite the text, or append if the tag is new.
                try:
                    mapping[el.tag].text = el.text
                except KeyError:
                    mapping[el.tag] = el
                    one.append(el)
            else:
                # Element with children: recurse, or append if the tag is new.
                try:
                    self.combine_element(mapping[el.tag], el)
                except KeyError:
                    mapping[el.tag] = el
                    one.append(el)


if __name__ == '__main__':
    r = XMLCombiner(('XML1.xml', 'XML2.xml')).combine()
    # et.tostring() returns bytes, so the output file is opened in binary mode.
    with open("combined.xml", "wb") as out:
        out.write(r)
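
To make the desired behaviour concrete, here is a rough sketch of the infoID-keyed merge described above. It is only an illustration under some assumptions: the function name `merge_by_info_id` is made up, XML2's `<Data>` is assumed to hold only flat `<info>` entries, and each infoID is assumed to appear at most once in XML2.

```python
import copy
from xml.etree import ElementTree as ET


def merge_by_info_id(xml1_root, xml2_root):
    """Hypothetical helper: return a copy of XML2's root whose <Data> holds
    XML1's <info> tree, with XML2's fields added wherever infoID matches."""
    # Index XML2's <info> elements by infoID (assumes each infoID is unique).
    extras = {}
    for info in xml2_root.iter("info"):
        key = info.findtext("infoID")
        if key is not None:
            extras[key] = info

    merged = copy.deepcopy(xml2_root)
    data = merged.find("Data")
    # Drop XML2's flat <info> entries; they are folded into XML1's tree below.
    for info in data.findall("info"):
        data.remove(info)

    for top in xml1_root.findall("info"):
        top = copy.deepcopy(top)
        # list() so that appending children does not disturb the traversal.
        for info in list(top.iter("info")):
            match = extras.get(info.findtext("infoID"))
            if match is None:
                continue
            # Append only the fields the XML1 entry does not already have.
            for field in match:
                if info.find(field.tag) is None:
                    info.append(copy.deepcopy(field))
        data.append(top)
    return merged
```

With the two files from the question, this would be called as `merge_by_info_id(ET.parse('XML1.xml').getroot(), ET.parse('XML2.xml').getroot())` and the result passed to `ET.tostring()`. Entries such as CCC, DD and EEEEE, which have no counterpart in XML2, are left as they are.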