Python: merge certain XML tags from file1 into file2

Time: 2017-08-23 07:59:33

Tags: xml python-2.7 elementtree

I have two XML files, XML1 and XML2. I want to update XML1 with the contents of XML2.

XML1

<vdu xmlns="test:file">
<lsm>
    <m-id>v1</m-id>
    <name>v1</name>
    <communication>bi</communication>
    <states>
        <s-name>stage1</s-name>
        <state>
            <s-type>defaultState</s-type>
            <s-func>
                <p-name>pkgname</p-name>
                <f-list>
                    <f-name>funcNAME</f-name>
                    <f-arg>{&amp;}</f-arg>
                </f-list>
            </s-func>
        </state>
        <lib-path>libpath</lib-path>
        <e-list>
            <e-name>noEvent</e-name>
            <event>
                <nss>INC</nss>
                <nfs>INC</nfs>
                <actions>
                    <p-name>pkgName</p-name>
                    <f-list>
                        <f-name>toF</f-name>
                        <f-arg></f-arg>
                    </f-list>
                </actions>
            </event>
        </e-list>

XML2

<vdu xmlns="test:file">
<lsm>
    <m-id>v1</m-id>
    <name>v1</name>
    <communication>bi</communication>
        <e-list>
            <e-name>noEvent</e-name>
            <event>
                <nss>INC</nss>
                <nfs>INC</nfs>
                <actions>
                    <p-name>pkgName</p-name>
                    <f-list>
                        <f-name>toF</f-name>
                        <f-arg></f-arg>
                    </f-list>
                </actions>
            </event>
        </e-list>

I am trying to retrieve the text from XML2 and update the corresponding elements in XML1 (I am not sure how this should be done). This is what I have so far.

import xml.etree.ElementTree as ET

source = (r'C:\XML2.xml')
destination = (r'C:\XML1.xml')

source_tree = ET.parse(source)
source_root = source_tree.getroot()


dest_tree = ET.parse(destination)
dest_root = dest_tree.getroot()
xmltag = dest_root.tag

newroot = ET.Element(xmltag)


for source_elem in source_root.iter('e-list'):
    for ele_verbose in source_elem:
        newroot.append(source_elem)      
        for open_network in ele_verbose:
            newroot.append(open_network)

ET.ElementTree(newroot).write(destination, xml_declaration=True, encoding='UTF-8')   # This writes only the copied elements to the file; the other tags of XML1 are not retained.

If you have a better way to achieve this, please suggest it.

1 answer:

Answer 0: (score: 1)

Here is an unpolished way to do it with lxml (I don't have much experience with it):

from lxml import etree

source = (r'test_xml2.xml')
destination = (r'test_xml.xml')

root1 = etree.parse(destination).getroot()
root2 = etree.parse(source).getroot()
all_elements1 = root1.findall('.//*')
all_elements2 = root2.findall('.//*')

def complete_tag_check(e1,e2):
    while e1.tag == e2.tag:
        if e1.tag == root1.tag:
            return True
        else:
            e1 = e1.getparent()
            e2 = e2.getparent()
    return False

for el2 in all_elements2:
    remove_el = False
    if el2.text is not None and el2.text.strip()!='':
        for el1 in all_elements1:

##            # this will only compare element tags
##            if el1.tag == el2.tag:
##                el1.text = el2.text
##                remove_el = True
##                break


            # This compares the tags level by level, walking up
            # through the parents until it reaches the root. Only
            # elements with exactly the same ancestor tags qualify.
            if complete_tag_check(el1,el2):
                el1.text = el2.text
                remove_el = True
                break

        # If the XML files contain several elements with the same
        # tag path, each one from XML2 would match the first one in
        # XML1 again and keep overwriting it. To avoid this, we remove
        # each element whose text was changed from all_elements1.
        if remove_el:
            all_elements1.remove(el1)


ET = etree.ElementTree(root1)
ET.write('test_xml3.xml', pretty_print=True)
# wrote to a different file so you could compare results
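
Since the question is tagged elementtree, the same idea can also be sketched with only the standard library. This is a minimal sketch, not a definitive implementation: the helper names `iter_with_path` and `merge_text` and the small inline documents are made up for illustration. It edits the parsed destination tree in place instead of building a new root, so elements that exist only in XML1 are kept. Note that with `xmlns="test:file"` ElementTree stores tags in the form `{test:file}name`, which is why the lookups at the end use that prefix.

```python
import xml.etree.ElementTree as ET

NS = '{test:file}'  # ElementTree prefixes every tag with the namespace URI


def iter_with_path(elem, path=()):
    """Yield (path, element) for every descendant, in document order.

    path is the tuple of tags leading from below elem down to the element.
    """
    for child in elem:
        child_path = path + (child.tag,)
        yield child_path, child
        for item in iter_with_path(child, child_path):
            yield item


def merge_text(dest_root, source_root):
    """Copy non-blank text from source elements into destination
    elements that have the same tag path. Returns the number of updates."""
    if dest_root.tag != source_root.tag:
        return 0
    candidates = list(iter_with_path(dest_root))
    updated = 0
    for src_path, src_elem in iter_with_path(source_root):
        if src_elem.text is None or not src_elem.text.strip():
            continue
        for i, (dst_path, dst_elem) in enumerate(candidates):
            if dst_path == src_path:
                dst_elem.text = src_elem.text
                # Use each destination element at most once, so repeated
                # tags are paired up in document order.
                del candidates[i]
                updated += 1
                break
    return updated


# Hypothetical, shortened stand-ins for XML1 and XML2.
XML1 = ('<vdu xmlns="test:file"><lsm><m-id>v1</m-id><name>old</name>'
        '<lib-path>libpath</lib-path></lsm></vdu>')
XML2 = ('<vdu xmlns="test:file"><lsm><m-id>v2</m-id>'
        '<name>new</name></lsm></vdu>')

dest_root = ET.fromstring(XML1)
source_root = ET.fromstring(XML2)
count = merge_text(dest_root, source_root)

print(count)
print(dest_root.findtext(NS + 'lsm/' + NS + 'name'))
print(dest_root.findtext(NS + 'lsm/' + NS + 'lib-path'))
```

With real files you would call `ET.parse(destination)`, pass both roots to `merge_text`, and then call `write()` on the destination tree itself. Calling `ET.register_namespace('', 'test:file')` before writing should keep the default namespace in the output instead of an `ns0:` prefix. As with `complete_tag_check` above, only elements whose whole tag path matches are updated, so an `e-list` nested under `states` in one file will not match an `e-list` placed directly under `lsm` in the other.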