I am trying to convert some XML to JSON (currently using xml.etree.ElementTree), and I am learning Python along the way by using it on a real but non-critical task. Sample XML:
<ExportData name="ExportData" hwId="0120">
<input name="Ethernet" type="Ethernet" id="100" numTs="0" ... />
<input name="ASI" type="ASI" id="0" numTs="1" ... >
<setup name="ASI Input 1" id="1" description="ASI" tsSync="no" currentlyMonitored="true" ... />
</input>
<input name="FD1" type="FD" id="1" numTs="1" ... >
<setup name="NewPreset1" id="1" description="642 MHz" ... />
</input>
<input name="FD2" type="FD" id="2" numTs="0" ... />
</ExportData>
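My current task: for every "input" node that has a child node named "setup", get the combined (concatenated) name and id (in the example above: name = "ASI:ASI Input 1" and id = "0:1"), then get all attributes of both nodes, the current one and its child, except name and id (in the example above: numTs, description, tsSync, ...).
I have googled many code examples based on different approaches (XPath, if/for over root.childNodes, and so on), and I can now extract the attributes of either the parent node or the child node (in different ways), but I still cannot get all of them together...
Then I need to print the parsed data as JSON, like this: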
{
"data":[
{ "{#INPUTID}":"0:1", "{#INPUTNAME}":"ASI:ASI Input 1", "{#INPUTPARAM}":"numTs" },
{ "{#INPUTID}":"0:1", "{#INPUTNAME}":"ASI:ASI Input 1", "{#INPUTPARAM}":"tsSync" },
{ "{#INPUTID}":"1:1", "{#INPUTNAME}":"FD1:NewPreset1", "{#INPUTPARAM}":"description" }
...
]
}
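(The JSON is laid out here for human readability; any valid JSON will do.)
How can I solve this in a graceful, Pythonic way (with a neat algorithm and proper error and exception handling)? Thanks in advance!
Update: my progress so far: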
import json
import xml.etree.ElementTree as ET

# xml is assumed to hold the XML text shown above
ExportData = ET.fromstring(xml)
# First, create an empty output dict from the template.
# It will be filled with the needed data later.
outData = {'data': []}
# Then build two dicts, one for the node and one for the subnode,
# if the node contains the required child element.
# All further manipulation is done on these dicts.
for inputNode in ExportData.findall('input'):
    # The sample XML above names the child element 'setup'
    setupNode = inputNode.find('setup')
    if setupNode is not None:
        # Copy the attributes so that del does not modify the parsed tree
        inputParams = dict(inputNode.attrib)
        setupParams = dict(setupNode.attrib)
        inputId = inputParams['id'] + ':' + setupParams['id']
        inputName = inputParams['name'] + ':' + setupParams['name']
        del inputParams['name'], inputParams['id']  # , inputParams['numTs']
        del setupParams['name'], setupParams['id']  # , setupParams['numTs']
        # Merge both dicts; on a name clash the child value wins
        commonParams = dict(inputParams)
        commonParams.update(setupParams)
        for param in commonParams:
            outData['data'].append({'{#INPUTID}': inputId, '{#INPUTNAME}': inputName, '{#INPUTPARAM}': param})
# Finally, dump the data to JSON
print(json.dumps(outData, sort_keys=True, indent=2))
Answer 0 (score: 0)
Here is some code to get you started:
import xml.etree.ElementTree as ET

s = """<ExportData name="ExportData" hwId="0120">
<input name="Ethernet" type="Ethernet" id="100" numTs="0" />
<input name="ASI" type="ASI" id="0" numTs="1" >
<setup name="ASI Input 1" id="1" description="ASI" tsSync="no" currentlyMonitored="true" />
</input>
<input name="FD1" type="FD" id="1" numTs="1" >
<setup name="NewPreset1" id="1" description="642 MHz" />
</input>
<input name="FD2" type="FD" id="2" numTs="0" />
</ExportData>"""
tree = ET.fromstring(s)
for node in tree.iter('input'):
    # First direct child tagged 'setup', or None if there is none
    child = next((c for c in node if c.tag == 'setup'), None)
    if child is None:
        continue
    else:
        print(node, child)
This produces the following output (the memory addresses will differ from run to run):
<Element 'input' at 0x1047541d0> <Element 'setup' at 0x104754210>
<Element 'input' at 0x104754250> <Element 'setup' at 0x104754290>
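This gives you the parent node and the child node. From there, you can easily get their attributes with node.attrib and child.attrib. After that, it is just a matter of combining the attributes you want to join and formatting the rest.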
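To make that last step concrete, here is a minimal sketch of one way to combine the attributes and produce the JSON from the question. It is not the only way to do it. The function names build_discovery and to_json, the SKIP tuple, and the choice to skip nodes that lack a name or id are assumptions made for illustration; the question does not say how such nodes should be handled.

```python
import json
import xml.etree.ElementTree as ET

# Attributes that go into the combined name/ID rather than the parameter list
SKIP = ("name", "id")


def build_discovery(xml_text, child_tag="setup"):
    """Return {"data": [...]} for every <input> that has a child_tag child."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Report malformed input as a plain ValueError (an illustrative choice)
        raise ValueError("malformed XML: %s" % exc)

    rows = []
    for node in root.iter("input"):
        child = node.find(child_tag)  # first direct child with that tag
        if child is None:
            continue
        # .get() returns None for a missing attribute; such nodes are skipped
        ids = (node.get("id"), child.get("id"))
        names = (node.get("name"), child.get("name"))
        if None in ids or None in names:
            continue
        input_id = ":".join(ids)
        input_name = ":".join(names)
        # Merge copies of both attribute dicts; child values win on a clash
        params = dict(node.attrib)
        params.update(child.attrib)
        for param in sorted(params):
            if param in SKIP:
                continue
            rows.append({
                "{#INPUTID}": input_id,
                "{#INPUTNAME}": input_name,
                "{#INPUTPARAM}": param,
            })
    return {"data": rows}


def to_json(xml_text):
    """Serialize the result in the indented form shown in the question."""
    return json.dumps(build_discovery(xml_text), sort_keys=True, indent=2)
```

Calling print(to_json(s)) with the string s from above should print one entry per remaining attribute of each input/setup pair. Working on copies of node.attrib means the parsed tree is left untouched, which matters if you need to walk it again later.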