Is there a library or mechanism that can be used to flatten an XML file?
Existing:
<A>
  <B>
    <ConnectionType>a</ConnectionType>
    <StartTime>00:00:00</StartTime>
    <EndTime>00:00:00</EndTime>
    <UseDataDictionary>N</UseDataDictionary>
  </B>
</A>
Desired:
A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N
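Answer 0 (score: 5)
Use xmltodict to convert your XML file to a dictionary, then flatten that dict with the function from the Code Review answer credited in the code below. That should do it.
Example: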
# Original code: https://codereview.stackexchange.com/a/21035
from collections import OrderedDict

import xmltodict


def flatten_dict(d):
    def items():
        for key, value in d.items():
            if isinstance(value, dict):
                for subkey, subvalue in flatten_dict(value).items():
                    yield key + "." + subkey, subvalue
            else:
                yield key, value
    return OrderedDict(items())


# Convert to dict
with open('test.xml', 'rb') as f:
    xml_content = xmltodict.parse(f)

# Flatten dict
flattened_xml = flatten_dict(xml_content)

# Print in desired format
for k, v in flattened_xml.items():
    print('{} = {}'.format(k, v))
Output:
A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N
Answer 1 (score: 0)
This is not a complete implementation, but you can take advantage of lxml's getelementpath:
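from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"

from lxml import etree

xml = """<A>
  <B>
    <ConnectionType>a</ConnectionType>
    <StartTime>00:00:00</StartTime>
    <EndTime>00:00:00</EndTime>
    <UseDataDictionary>N
      <UseDataDictionary2>G</UseDataDictionary2>
    </UseDataDictionary>
  </B>
</A>"""

tree = etree.parse(StringIO(xml))
root = tree.getroot().tag
for node in tree.iter():
    # Iterating an element yields its direct children (getchildren() is deprecated)
    for child in node:
        # child.text is None for empty elements, so guard before stripping
        text = (child.text or "").strip()
        if text:
            # getelementpath returns a path relative to the root, e.g. "B/StartTime"
            path = tree.getelementpath(child).replace("/", ".")
            print("{}.{} = {}".format(root, path, text))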
This gives you:
A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N
A.B.UseDataDictionary.UseDataDictionary2 = G
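If lxml is not available, roughly the same walk can be sketched with the standard library's xml.etree.ElementTree, building the dotted path during recursion instead of asking the tree for it. This swaps in a different technique from the answer above. The names flatten_xml, walk and sample are illustrative only, and the sketch assumes no namespaces or attributes and does not disambiguate repeated sibling tags.

```python
import xml.etree.ElementTree as ET


def flatten_xml(xml_text, sep="."):
    """Return a list of (dotted_path, text) pairs for elements that carry text."""
    root = ET.fromstring(xml_text)
    pairs = []

    def walk(elem, path):
        # elem.text is None for empty elements, so guard before stripping
        text = (elem.text or "").strip()
        if text:
            pairs.append((path, text))
        for child in elem:
            walk(child, path + sep + child.tag)

    walk(root, root.tag)
    return pairs


sample = """<A>
  <B>
    <ConnectionType>a</ConnectionType>
    <StartTime>00:00:00</StartTime>
    <EndTime>00:00:00</EndTime>
    <UseDataDictionary>N
      <UseDataDictionary2>G</UseDataDictionary2>
    </UseDataDictionary>
  </B>
</A>"""

for path, text in flatten_xml(sample):
    print("{} = {}".format(path, text))
```

Because the path is built from tag names alone, two sibling elements with the same tag end up with the same key. Adding an index to the path would be one way to tell them apart.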
Answer 2 (score: 0)
This is an improved version of ƘɌỈsƬƠƑ's answer that also handles nested lists:
from collections import OrderedDict


def flatten_dict(d):
    def items():
        for key, value in d.items():
            # nested subtree
            if isinstance(value, dict):
                for subkey, subvalue in flatten_dict(value).items():
                    yield '{}.{}'.format(key, subkey), subvalue
            # nested list
            elif isinstance(value, list):
                for num, elem in enumerate(value):
                    if isinstance(elem, dict):
                        for subkey, subvalue in flatten_dict(elem).items():
                            yield '{}.[{}].{}'.format(key, num, subkey), subvalue
                    else:
                        # list of plain values, e.g. repeated text-only elements
                        yield '{}.[{}]'.format(key, num), elem
            # everything else (only leaves should remain)
            else:
                yield key, value
    return OrderedDict(items())
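To make the resulting key format concrete, here is a small self-contained sketch that produces the same 'parent.[index].child' style of keys. The function name flatten and the sample dict are illustrative and not part of the answer. The dict is hand-written to resemble what xmltodict returns when an element is repeated, which is a list of dicts under one key.

```python
def flatten(value, prefix=""):
    """Yield (key, leaf) pairs, using '.[n]' segments for list positions."""
    if isinstance(value, dict):
        for key, sub in value.items():
            yield from flatten(sub, prefix + "." + key if prefix else key)
    elif isinstance(value, list):
        for num, elem in enumerate(value):
            yield from flatten(elem, "{}.[{}]".format(prefix, num))
    else:
        # Leaf value: emit it under the path accumulated so far
        yield prefix, value


# Hand-written stand-in for a parsed document containing two <B> elements
parsed = {
    "A": {
        "B": [{"StartTime": "00:00:00"}, {"StartTime": "12:00:00"}],
        "Name": "x",
    }
}

flat = dict(flatten(parsed))
for k, v in flat.items():
    print("{} = {}".format(k, v))
```

This prints A.B.[0].StartTime = 00:00:00, A.B.[1].StartTime = 12:00:00 and A.Name = x, so each repeated element keeps its position in the key.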