I am trying to merge two similar XML files using Python.
file1.xml:
<data>
  <a>
    <b>
      <c>value1</c>
      <d>value2</d>
    </b>
    <e>
      <f>value3</f>
      <g>value4</g>
    </e>
  </a>
</data>
file2.xml:
<data>
  <a>
    <b>
      <c>value5</c>
      <d>value6</d>
    </b>
    <e>
      <f>value7</f>
      <g>value8</g>
    </e>
  </a>
</data>
Desired output (file3.xml): combine all child elements of the duplicated b elements, but keep the duplicated e elements separate.
<data>
  <a>
    <b>
      <c>value1</c>
      <d>value2</d>
      <c>value5</c>
      <d>value6</d>
    </b>
    <e>
      <f>value3</f>
      <g>value4</g>
    </e>
    <e>
      <f>value7</f>
      <g>value8</g>
    </e>
  </a>
</data>
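Answer 0 (score: 1)
To solve your problem, I convert each XML file to a Python dict, merge the dictionaries manually, and then convert the merged dict back to XML. Like this (tested with Python 2.7):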
import xmltodict

try:
    from collections.abc import Mapping  # Python 3.3+
except ImportError:
    from collections import Mapping  # Python 2.7


def merge_dict(d, u):
    """
    Merge dictionary u into dictionary d. Handles nested dictionaries
    and multiple values with the same key.
    Return the merged dict (d is modified in place).
    """
    for k, v in u.items():
        if isinstance(v, Mapping):
            # Nested element: merge recursively
            d[k] = merge_dict(d.get(k, {}), v)
        else:
            # No more nesting
            if k in d:
                # Manage multiple values with the same name
                if not isinstance(d[k], list):
                    # If not a list yet, create one
                    d[k] = [d[k]]
                d[k].append(v)
            else:
                # Single value
                d[k] = v
    return d


if __name__ == "__main__":
    # Open input files
    with open("file1.xml", "r") as file1_xml, open("file2.xml", "r") as file2_xml:
        # Convert XML to dictionaries
        file1_dict = xmltodict.parse(file1_xml.read())
        file2_dict = xmltodict.parse(file2_xml.read())

    # Merge dictionaries with the special function
    file3_dict = merge_dict(file1_dict, file2_dict)

    # Open output file and write the merged dict back as XML
    with open("file3.xml", "w") as file3_xml:
        file3_xml.write(xmltodict.unparse(file3_dict))
I use the xmltodict module for the conversion step. To install it, run pip install xmltodict
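One thing to be aware of: a dict-based merge keyed on tag names treats b and e the same way, so the f and g values also end up collected under a single e, and repeated tags are grouped together (c, c, d, d). If b should be combined while e stays separate, as the question asks, the following is a minimal sketch using only the standard library's xml.etree.ElementTree. The function name merge_elements and the keep_separate parameter are my own hypothetical names, and the rule it applies (leaf elements and listed tags are appended, other matching tags are merged recursively) is an assumption about what the question wants, not a general-purpose XML merge.

```python
import copy
import xml.etree.ElementTree as ET


def merge_elements(target, source, keep_separate=frozenset()):
    """Merge the children of `source` into `target` in place (hypothetical helper).

    A child is appended as a new sibling when its tag is listed in
    `keep_separate`, when it is a leaf (has no child elements), or when
    `target` has no child with the same tag. Otherwise it is merged
    recursively into the first child of `target` with that tag.
    """
    for child in source:
        match = target.find(child.tag)
        if child.tag in keep_separate or len(child) == 0 or match is None:
            # Copy so the source tree is left untouched
            target.append(copy.deepcopy(child))
        else:
            merge_elements(match, child, keep_separate)
    return target


# Compact versions of file1.xml and file2.xml from the question
FILE1 = ("<data><a><b><c>value1</c><d>value2</d></b>"
         "<e><f>value3</f><g>value4</g></e></a></data>")
FILE2 = ("<data><a><b><c>value5</c><d>value6</d></b>"
         "<e><f>value7</f><g>value8</g></e></a></data>")

root = merge_elements(ET.fromstring(FILE1), ET.fromstring(FILE2),
                      keep_separate={"e"})
merged_xml = ET.tostring(root, encoding="unicode")
print(merged_xml)
```

With real files, the roots could be obtained with ET.parse("file1.xml").getroot() and ET.parse("file2.xml").getroot(), and the result written with ET.ElementTree(root).write("file3.xml"). If the input files are indented, the appended elements carry their original whitespace with them, so the output may need re-indenting.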