Using Python

Time: 2018-04-25 17:48:11

Tags: python xml

I am trying to merge two similar XML files using Python.

file1.xml:

<data>
  <a>
    <b>
      <c>value1</c>
      <d>value2</d>
    </b>
    <e>
      <f>value3</f>
      <g>value4</g>
    </e>
  </a>
</data>

file2.xml:

<data>
  <a>
    <b>
      <c>value5</c>
      <d>value6</d>
    </b>
    <e>
      <f>value7</f>
      <g>value8</g>
    </e>
  </a>
</data>

Desired output (file3.xml): combine all child elements of the duplicate b elements, but do not combine the duplicate e elements.

<data>
  <a>
    <b>
      <c>value1</c>
      <d>value2</d>
      <c>value5</c>
      <d>value6</d>
    </b>
    <e>
      <f>value3</f>
      <g>value4</g>
    </e>
    <e>
      <f>value7</f>
      <g>value8</g>
    </e>
  </a>
</data>

1 answer:

Answer 0 (score: 1)

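To solve your problem, I converted the XML to a Python dict and merged the dictionaries manually. Afterwards, I converted the dict back to XML. Like this (tested with Python 2.7):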

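import xmltodict

try:
    from collections.abc import Mapping  # Python 3.3+
except ImportError:
    from collections import Mapping  # Python 2.7


def merge_dict(d, u):
    """
        Merge dictionary u into d in place. Nested dictionaries are merged
        recursively; repeated non-dict keys are collected into a list.
        Return the merged dict d.
    """
    for k, v in u.items():
        if isinstance(v, Mapping):
            d[k] = merge_dict(d.get(k, {}), v)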
        else:
            # No more nested
            if k in d:
                # Manage multiple values with same name
                if not isinstance(d[k], list):
                    # if not a list create one
                    d[k] = [d[k]]
                d[k].append(v)
            else:
                # Single value
                d[k] = v
    return d


if __name__ == "__main__":
    # Open input files
    with open("file1.xml", "r") as file1_xml, open("file2.xml", "r") as file2_xml:
        # Convert xml to dictionary
        file1_dict = xmltodict.parse(file1_xml.read())
        file2_dict = xmltodict.parse(file2_xml.read())

        # Merge dictionaries with special function
        file3_dict = merge_dict(file1_dict, file2_dict)

        # Open output file
        with open("file3.xml", "w") as file3_xml:
            file3_xml.write(xmltodict.unparse(file3_dict))
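
As a quick check of what merge_dict does with the sample data, here is a minimal sketch that runs the same merge logic on hand-written dictionaries. The two literals are an assumption: they stand in for what xmltodict.parse is expected to return for file1.xml and file2.xml, so the sketch runs without the input files or the xmltodict module.

```python
try:
    from collections.abc import Mapping  # Python 3.3+
except ImportError:
    from collections import Mapping  # Python 2.7


def merge_dict(d, u):
    """Same logic as in the answer: recurse into mappings, collect repeated leaf keys in a list."""
    for k, v in u.items():
        if isinstance(v, Mapping):
            d[k] = merge_dict(d.get(k, {}), v)
        elif k in d:
            if not isinstance(d[k], list):
                d[k] = [d[k]]
            d[k].append(v)
        else:
            d[k] = v
    return d


# Hand-written stand-ins for the parsed files (assumed shape, not real parser output)
file1_dict = {"data": {"a": {"b": {"c": "value1", "d": "value2"},
                             "e": {"f": "value3", "g": "value4"}}}}
file2_dict = {"data": {"a": {"b": {"c": "value5", "d": "value6"},
                             "e": {"f": "value7", "g": "value8"}}}}

merged = merge_dict(file1_dict, file2_dict)
print(merged["data"]["a"]["b"])
print(merged["data"]["a"]["e"])
```

Both b and e end up with list values, because the function recurses into every nested mapping in the same way. When written back out, that would be expected to put the repeated f and g values inside a single e element rather than producing two separate e elements, and to group the c values before the d values inside b.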

I use the xmltodict module for the conversion part. To install it, use pip install xmltodict
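
If the two e elements need to stay separate, as in the desired output from the question, one alternative is to work on the element tree directly. The following is only a sketch using the standard library's xml.etree.ElementTree, not the approach from the answer above. The names merge_trees and combine_tags are hypothetical, introduced here for illustration, and the inline strings stand in for file1.xml and file2.xml.

```python
import xml.etree.ElementTree as ET

FILE1 = """<data>
  <a>
    <b>
      <c>value1</c>
      <d>value2</d>
    </b>
    <e>
      <f>value3</f>
      <g>value4</g>
    </e>
  </a>
</data>"""

FILE2 = """<data>
  <a>
    <b>
      <c>value5</c>
      <d>value6</d>
    </b>
    <e>
      <f>value7</f>
      <g>value8</g>
    </e>
  </a>
</data>"""


def merge_trees(root1, root2, combine_tags=("b",)):
    """Merge the children of root2's <a> into root1's <a> (hypothetical helper).

    Elements whose tag is in combine_tags have their children appended to the
    element that already exists in root1; every other element is appended as
    an additional sibling.
    """
    a1 = root1.find("a")
    a2 = root2.find("a")
    for child in list(a2):
        existing = a1.find(child.tag)
        if child.tag in combine_tags and existing is not None:
            # Move the grandchildren into the element that already exists
            existing.extend(list(child))
        else:
            # Keep the element as a separate sibling
            a1.append(child)
    return root1


merged = merge_trees(ET.fromstring(FILE1), ET.fromstring(FILE2))

# ET.indent only exists in Python 3.9+, so guard the call
if hasattr(ET, "indent"):
    ET.indent(merged, space="  ")

print(ET.tostring(merged, encoding="unicode"))
```

For real files, ET.parse("file1.xml").getroot() would take the place of ET.fromstring(FILE1), and ET.ElementTree(merged).write("file3.xml") would save the result.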