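I want to merge two XML files. I have read many solutions, but each one is specific to the files in that question. I am using xml.etree.ElementTree and lxml to parse the files, compare them, and get the differences. As I understand it, my next step is: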
for element in file2.xml:
    if element present in file1.xml:
        append to output_file.xml
    else:
        copy element to the output_file
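However, I have not worked with XML before, and the available merge tools require a license, so I need to write a generic script that merges the files into the format I want.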
file1.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<great_grands>
<great_grandpa_name_one>great_grandpa_name</great_grandpa_name_one>
<grandpa>
<grandpa_name>grandpa_name_one_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name>grandpa_name_two_1</grandpa_name>
</grandpa>
<grandma>
<grandma_name>grandma_name_one_1</grandma_name>
</grandma>
<grandma>
<grandma_name>grandma_name_two_1</grandma_name>
</grandma>
</great_grands>
file2.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<great_grands>
<great_grandpa_name_two>great_grandpa_name</great_grandpa_name_two>
<grandpa>
<grandpa_name_2>grandpa_name_one_2</grandpa_name_2>
</grandpa>
<grandma>
<grandma_name_2>grandma_name_one_2</grandma_name_2>
</grandma>
</great_grands>
Desired output
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<great_grands>
<great_grandpa_name_one>great_grandpa_name</great_grandpa_name_one>
<great_grandma_name_two>great_grandma_name</great_grandma_name_two>
<grandpa>
<grandpa_name>grandpa_name_one_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name>grandpa_name_two_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name_2>grandpa_name_one_2</grandpa_name_2>
</grandpa>
<grandma>
<grandma_name>grandma_name_one_1</grandma_name>
</grandma>
<grandma>
<grandma_name>grandma_name_two_1</grandma_name>
</grandma>
<grandma>
<grandma_name_2>grandma_name_one_2</grandma_name_2>
</grandma>
</great_grands>
Answer 0 (score: 0)
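Consider XSLT, a special-purpose declarative language and a sibling of XPath, designed to transform XML files. With its document() function, it can parse external XML files referenced by relative path. Python's lxml module can run XSLT 1.0 scripts.
Because an XSLT script is itself a well-formed XML file, it can be parsed either from a file or from an embedded string. The following assumes that all files and scripts are saved in the same directory:
XSLT script (save as an .xsl file; note that it references only file2.xml)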
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="/great_grands">
<xsl:copy>
<xsl:copy-of select="great_grandpa_name_one"/>
<xsl:copy-of select="document('file2.xml')/great_grands/great_grandpa_name_two"/>
<xsl:copy-of select="grandpa"/>
<xsl:copy-of select="document('file2.xml')/great_grands/grandpa"/>
<xsl:copy-of select="grandma"/>
<xsl:copy-of select="document('file2.xml')/great_grands/grandma"/>
</xsl:copy>
</xsl:template>
</xsl:transform>
Python script (note that it references only file1.xml)
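from lxml import etree

# Parse the source document and the XSLT stylesheet
xml = etree.parse('file1.xml')
xsl = etree.parse('XSLTScript.xsl')

# Build the transformer and apply it to file1.xml
transform = etree.XSLT(xsl)
newdom = transform(xml)

# Save the result, serialized according to <xsl:output>, to a file
with open('Output.xml', 'wb') as f:
    f.write(bytes(newdom))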
Output
<?xml version="1.0" encoding="UTF-8"?>
<great_grands>
<great_grandpa_name_one>great_grandpa_name</great_grandpa_name_one>
<great_grandpa_name_two>great_grandpa_name</great_grandpa_name_two>
<grandpa>
<grandpa_name>grandpa_name_one_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name>grandpa_name_two_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name_2>grandpa_name_one_2</grandpa_name_2>
</grandpa>
<grandma>
<grandma_name>grandma_name_one_1</grandma_name>
</grandma>
<grandma>
<grandma_name>grandma_name_two_1</grandma_name>
</grandma>
<grandma>
<grandma_name_2>grandma_name_one_2</grandma_name_2>
</grandma>
</great_grands>
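The XSLT above names each element explicitly, so it has to be edited whenever the files change shape. If a more generic merge is needed, a rough pure-Python sketch using only xml.etree.ElementTree could look like the following. The helper names merge_children and merge_files are hypothetical, and the sketch assumes both files share the same root element and that only the root's direct children need merging. Children are grouped by tag in first-seen order, so a tag that appears only in the second file ends up after the groups from the first file, which differs slightly from the ordering in the output above.

```python
import copy
import xml.etree.ElementTree as ET


def merge_children(root1, root2):
    """Return a new root holding the children of both roots, grouped by tag.

    Hypothetical helper: assumes both roots have the same tag and that
    only their direct children need to be merged.
    """
    merged = ET.Element(root1.tag, dict(root1.attrib))
    groups = {}  # tag -> list of copied elements; dicts keep first-seen order
    for child in list(root1) + list(root2):
        # Copy each child so the input trees are left untouched
        groups.setdefault(child.tag, []).append(copy.deepcopy(child))
    for elements in groups.values():
        merged.extend(elements)
    return merged


def merge_files(path1, path2, out_path):
    """Parse two XML files, merge their roots, and write the result."""
    root1 = ET.parse(path1).getroot()
    root2 = ET.parse(path2).getroot()
    merged = merge_children(root1, root2)
    ET.ElementTree(merged).write(out_path, encoding="UTF-8", xml_declaration=True)
```

Calling merge_files('file1.xml', 'file2.xml', 'Output.xml') would then write the merged tree. The result is not pretty-printed; on Python 3.9 or later, ET.indent(merged) can be called before writing if indentation matters.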