I have a Python XML parsing problem that I can't seem to figure out.
I have the following XML:
<data>
<data_in base="base64">
</data_in>
<log_sense_data>
<ds base="bool">1</ds>
<spf base="bool">0</spf>
<page_code base="hex">15</page_code>
<background_scan_results_log_page>
<parameter>
<parameter_code base="hex">0000</parameter_code>
<du base="bool">0</du>
<tsd base="bool">0</tsd>
<etc base="bool">0</etc>
<tmc base="hex">00</tmc>
<format_linking base="hex">03</format_linking>
<parameter_length base="dec">12</parameter_length>
<description base="string">background scanning status parameter</description>
<accumulated_power_on_minutes base="dec">579578</accumulated_power_on_minutes>
<background_scanning_status base="hex">01</background_scanning_status>
<number_of_background_scans_performed base="dec">112</number_of_background_scans_performed>
<background_scan_progress base="hex">00000036</background_scan_progress>
<number_of_background_medium_scans_performed base="dec">112</number_of_background_medium_scans_performed>
</parameter>
<parameter>
<parameter_code base="hex">0001</parameter_code>
<du base="bool">0</du>
<tsd base="bool">0</tsd>
<etc base="bool">0</etc>
<tmc base="hex">00</tmc>
<format_linking base="hex">03</format_linking>
<parameter_length base="dec">20</parameter_length>
<description base="string">background medium scan parameter</description>
<accumulated_power_on_minutes base="dec">82932</accumulated_power_on_minutes>
<reassign_status base="hex">05</reassign_status>
<sense_key base="hex">01</sense_key>
<additional_sense_code base="hex">17</additional_sense_code>
<additional_sense_code_qualifier base="hex">01</additional_sense_code_qualifier>
<vendor_specific base="hex">20e2570187</vendor_specific>
<logical_block_address base="hex">00000000478994d8</logical_block_address>
</parameter>
<parameter>
<parameter_code base="hex">0002</parameter_code>
<du base="bool">0</du>
<tsd base="bool">0</tsd>
<etc base="bool">0</etc>
<tmc base="hex">00</tmc>
<format_linking base="hex">03</format_linking>
<parameter_length base="dec">20</parameter_length>
<description base="string">background medium scan parameter</description>
<accumulated_power_on_minutes base="dec">104467</accumulated_power_on_minutes>
<reassign_status base="hex">05</reassign_status>
<sense_key base="hex">01</sense_key>
<additional_sense_code base="hex">18</additional_sense_code>
<additional_sense_code_qualifier base="hex">07</additional_sense_code_qualifier>
<vendor_specific base="hex">203ab846ea</vendor_specific>
<logical_block_address base="hex">00000000133d5046</logical_block_address>
</parameter>
</background_scan_results_log_page>
</log_sense_data>
</data>
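parameter_code 0000 will always be present, and it may be followed by any number of further parameter_code entries. Basically, I want to pull two values out of parameter_code 0000 (power-on minutes and number of background scans), plus most of the values from parameter_code 0001 and higher, so that I can put them into a database later. The code I have so far is: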
import xml.etree.ElementTree as et
# Excerpt from inside a larger loop, hence the bare `continue`.
log_page_tree = et.fromstring(results['Data']['RawData'])
if log_page_tree.find('log_sense_data') is None:
    continue
else:
    for element in log_page_tree.find('log_sense_data'):
        for pagecode in element.iter('page_code'):
            if pagecode.text == '15':
                for param in log_page_tree.find('log_sense_data').find('background_scan_results_log_page'):
                    for derp in param.iter():
                        print(derp.tag, derp.text)
                    #for totalpoweron in param.iter('accumulated_power_on_minutes'):
                        #print(totalpoweron.text)
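I want to be able to hold on to the two values from parameter_code 0000 while iterating over the remaining parameter_codes to put them into the database. Can anyone point me in the right direction? If I specify param.iter('somevalue') to fetch each value, the code doesn't seem to iterate.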
Answer 0 (score: 0)
OK, while there are some ways to simplify and improve your code, it sounds like you're happy with everything up to this line:
for param in log_page_tree.find('log_sense_data').find('background_scan_results_log_page'):
This does in fact iterate over each parameter.
But now you want to switch on whether parameter_code is 0000 and do something different in each case. So:
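# Map each element's "base" attribute to a function that parses its text.
converters = {
    'hex': lambda s: int(s, 16),
    'dec': int,
    'bool': lambda s: s == '1',  # bool('0') would be True, so compare instead
}
for param in log_page_tree.find('log_sense_data').find('background_scan_results_log_page'):
    if param.find('parameter_code').text == '0000':
        accumulated_power_on_minutes = int(param.find('accumulated_power_on_minutes').text)
        number_of_background_scans_performed = int(param.find('number_of_background_scans_performed').text)
    else:
        obj = {}
        for elem in param:  # iterate children directly; getchildren() is gone in Python 3.9+
            name = elem.tag
            base = elem.attrib['base']
            converter = converters.get(base, lambda x: x)  # unknown base: keep the string
            value = converter(elem.text)
            obj[name] = value
        # do something with obj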
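To see the whole thing run end to end, here is a minimal self-contained sketch for Python 3. The function name parse_scan_log, the trimmed SAMPLE_XML, and the (status, rows) return shape are assumptions made for illustration and are not part of the original code. It also swaps the nested find() calls for a single findall() path, which is one of the simplifications hinted at above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (only a few fields per parameter).
SAMPLE_XML = """
<data>
  <log_sense_data>
    <page_code base="hex">15</page_code>
    <background_scan_results_log_page>
      <parameter>
        <parameter_code base="hex">0000</parameter_code>
        <du base="bool">0</du>
        <accumulated_power_on_minutes base="dec">579578</accumulated_power_on_minutes>
        <number_of_background_scans_performed base="dec">112</number_of_background_scans_performed>
      </parameter>
      <parameter>
        <parameter_code base="hex">0001</parameter_code>
        <du base="bool">0</du>
        <description base="string">background medium scan parameter</description>
        <accumulated_power_on_minutes base="dec">82932</accumulated_power_on_minutes>
        <sense_key base="hex">01</sense_key>
        <logical_block_address base="hex">00000000478994d8</logical_block_address>
      </parameter>
    </background_scan_results_log_page>
  </log_sense_data>
</data>
"""

# Parsers keyed by the "base" attribute; unknown bases keep the raw string.
CONVERTERS = {
    "hex": lambda s: int(s, 16),
    "dec": int,
    "bool": lambda s: s == "1",
}

PARAM_PATH = "log_sense_data/background_scan_results_log_page/parameter"


def parse_scan_log(xml_text):
    """Return (status, rows) for a page-code-15 log, or ({}, []) otherwise.

    status holds the two values kept from parameter_code 0000;
    rows holds one dict per remaining parameter, ready for a database insert.
    """
    root = ET.fromstring(xml_text)
    status, rows = {}, []
    if root.findtext("log_sense_data/page_code") != "15":
        return status, rows
    for param in root.findall(PARAM_PATH):
        if param.findtext("parameter_code") == "0000":
            for name in ("accumulated_power_on_minutes",
                         "number_of_background_scans_performed"):
                status[name] = int(param.findtext(name))
        else:
            row = {}
            for elem in param:
                convert = CONVERTERS.get(elem.get("base"), lambda s: s)
                row[elem.tag] = convert(elem.text)
            rows.append(row)
    return status, rows


status, rows = parse_scan_log(SAMPLE_XML)
print(status)
for row in rows:
    print(row)
```

Because status is filled in before the other parameters are visited (0000 always comes first in the log page), its two values are available when each row is written to the database.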