How to flatten an XML file in Python

Time: 2016-08-09 13:56:58

Tags: python xml

Is there a library or mechanism that can be used to flatten an XML file?

Existing:

<A>
    <B>
        <ConnectionType>a</ConnectionType>
        <StartTime>00:00:00</StartTime>
        <EndTime>00:00:00</EndTime>
        <UseDataDictionary>N</UseDataDictionary>
    </B>
</A>

Desired:

A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N

3 answers:

Answer 0 (score: 5)

Using xmltodict to convert your XML file into a dictionary, combined with this answer on flattening a dict, should do it.

Example:

# Original code: https://codereview.stackexchange.com/a/21035
from collections import OrderedDict

def flatten_dict(d):
    def items():
        for key, value in d.items():
            if isinstance(value, dict):
                for subkey, subvalue in flatten_dict(value).items():
                    yield key + "." + subkey, subvalue
            else:
                yield key, value

    return OrderedDict(items())

import xmltodict

# Convert to dict
with open('test.xml', 'rb') as f:
    xml_content = xmltodict.parse(f)

# Flatten dict
flattened_xml = flatten_dict(xml_content)

# Print in desired format
for k,v in flattened_xml.items():
    print('{} = {}'.format(k,v))

Output:

A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N
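
The example above prints clean keys because the question's XML has no attributes. As a rough sketch of what happens otherwise, the snippet below runs the same kind of flattening on a hand-written dict. The dict is a stand-in (an assumption for illustration, not real parser output) shaped the way xmltodict represents attributes with its default settings: attribute keys get an "@" prefix and element text sits under "#text". The helper name flatten_dotted is hypothetical.

```python
def flatten_dotted(d, parent=""):
    """Flatten nested dicts into one dict whose keys are dotted paths."""
    flat = {}
    for key, value in d.items():
        path = parent + "." + key if parent else key
        if isinstance(value, dict):
            # Recurse into the subtree, carrying the path built so far
            flat.update(flatten_dotted(value, path))
        else:
            flat[path] = value
    return flat

# Hand-written stand-in for parser output (assumed shape, see text above)
parsed = {
    "A": {
        "B": {
            "@id": "1",
            "ConnectionType": {"@kind": "x", "#text": "a"},
            "StartTime": "00:00:00",
        }
    }
}

flat = flatten_dotted(parsed)
for k, v in flat.items():
    print("{} = {}".format(k, v))
```

With attributes present, the text of ConnectionType ends up under A.B.ConnectionType.#text rather than A.B.ConnectionType, so you may want to post-process those keys depending on the output format you need.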

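Answer 1 (score: 0)

This is not a complete implementation, but you can take advantage of lxml's getpath (the code below uses the closely related getelementpath):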

xml = """<A>
            <B>
               <ConnectionType>a</ConnectionType>
               <StartTime>00:00:00</StartTime>
               <EndTime>00:00:00</EndTime>
               <UseDataDictionary>N
               <UseDataDictionary2>G</UseDataDictionary2>
               </UseDataDictionary>
            </B>
       </A>"""


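from io import StringIO  # Python 3; Python 2 used "from StringIO import StringIO"

from lxml import etree

tree = etree.parse(StringIO(xml))

root = tree.getroot().tag
for node in tree.iter():
    for child in node:  # iterate direct children (getchildren() is deprecated)
        # child.text is None for empty elements, so guard before stripping
        text = (child.text or "").strip()
        if text:
            # getelementpath returns a path relative to the root, e.g. "B/StartTime"
            path = tree.getelementpath(child).replace("/", ".")
            print("{}.{} = {}".format(root, path, text))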

This gives you:

A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N
A.B.UseDataDictionary.UseDataDictionary2 = G
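
If lxml is not available, roughly the same output can be produced with the standard library's xml.etree.ElementTree and a small recursive walk. This is a swapped-in technique rather than the answer's own method, and it is only a sketch: the function name flatten_xml is hypothetical, attributes are ignored, and repeated sibling tags would yield duplicate paths because no index is added.

```python
import xml.etree.ElementTree as ET

def flatten_xml(xml_text):
    """Return (dotted_path, text) pairs for every element with non-blank text."""
    root = ET.fromstring(xml_text)
    pairs = []

    def walk(elem, path):
        # elem.text is None for empty elements, so guard before stripping
        text = (elem.text or "").strip()
        if text:
            pairs.append((path, text))
        for child in elem:
            walk(child, path + "." + child.tag)

    walk(root, root.tag)
    return pairs

xml = """<A>
    <B>
        <ConnectionType>a</ConnectionType>
        <StartTime>00:00:00</StartTime>
        <EndTime>00:00:00</EndTime>
        <UseDataDictionary>N
            <UseDataDictionary2>G</UseDataDictionary2>
        </UseDataDictionary>
    </B>
</A>"""

for path, text in flatten_xml(xml):
    print("{} = {}".format(path, text))
```

The walk builds the path itself while descending, so no path lookup per element is needed.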

Answer 2 (score: 0)

Here is an improved version of ƘɌỈsƬƠƑ's answer that also handles nested lists:

from collections import OrderedDict

def flatten_dict(d):
    def items():
        for key, value in d.items():
            # nested subtree
            if isinstance(value, dict):
                for subkey, subvalue in flatten_dict(value).items():
                    yield '{}.{}'.format(key, subkey), subvalue
            # nested list
            elif isinstance(value, list):
                for num, elem in enumerate(value):
                    if isinstance(elem, dict):
                        for subkey, subvalue in flatten_dict(elem).items():
                            yield '{}.[{}].{}'.format(key, num, subkey), subvalue
                    else:
                        # list of plain leaf values (e.g. repeated text-only tags)
                        yield '{}.[{}]'.format(key, num), elem
            # everything else (only leaves should remain)
            else:
                yield key, value

    return OrderedDict(items())
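
To get a feel for the keys the list branch produces, here is a small self-contained sketch. It uses a compact generator variant (the name flatten is hypothetical) with the same '[n]' indexing as the answer above, applied to a hand-written dict. The dict assumes the shape xmltodict returns when a tag is repeated under one parent, namely a list under a single key.

```python
def flatten(d, prefix=""):
    """Yield (dotted_key, value) pairs, indexing list items as '[n]'."""
    for key, value in d.items():
        path = prefix + key
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        elif isinstance(value, list):
            for num, elem in enumerate(value):
                item = "{}.[{}]".format(path, num)
                if isinstance(elem, dict):
                    yield from flatten(elem, item + ".")
                else:
                    # plain leaf inside a list (e.g. repeated text-only tags)
                    yield item, elem
        else:
            yield path, value

# Hand-written stand-in for a document with two <B> elements under <A>
parsed = {"A": {"B": [{"StartTime": "00:00:00"}, {"StartTime": "12:00:00"}]}}

flat = dict(flatten(parsed))
for k, v in flat.items():
    print("{} = {}".format(k, v))
```

Each repeated element gets its own index, so the two StartTime values no longer collide on the same key.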