I have a large XML file that I am parsing, and I want to print one specific value only when two conditions are met.
Here is the code so far:
#!/usr/local/bin/python
import xml.etree.ElementTree as ET

tree = ET.parse('onedb-dhcp.xml')
root = tree.getroot()

# This successfully gets all items in the xml:
print('This successfully gets all items in the xml:\n')
for p in root.iter('PROPERTY'):
    print(p.attrib)
print('\n----------------------------------------------------------')
Here is a sample XML file:
<DATABASE NAME="test" VERSION="43-39" MD5="." SCHEMA-MD5="." INT-VERSION="43-39">
<OBJECT><PROPERTY NAME="__type" VALUE="dhcp.lease"/><PROPERTY NAME="is_invalid_mac" VALUE="false"/><PROPERTY NAME="deferred_ttl" VALUE="300"/><PROPERTY NAME="ack_state" VALUE="renew"/><PROPERTY NAME="v6_prefix_bits" VALUE="0"/><PROPERTY NAME="is_ipv4" VALUE="true"/><PROPERTY NAME="vnode_id" VALUE="79"/><PROPERTY NAME="node_id" VALUE="79"/><PROPERTY NAME="ip_address" VALUE="10.10.1.6"/><PROPERTY NAME="dhcp_range" VALUE="10.10.1.5/10.10.1.254///0/"/><PROPERTY NAME="network_view" VALUE="0"/><PROPERTY NAME="starts" VALUE="2 2017/01/17 04:58:52"/><PROPERTY NAME="ends" VALUE="6 2017/01/21 04:58:52"/><PROPERTY NAME="tstp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="tsfp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="atsfp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="cltt" VALUE="2 2017/01/17 04:58:52"/><PROPERTY NAME="hardware" VALUE="00:1a:4b:26:fd:85"/><PROPERTY NAME="client_hostname" VALUE="&quot;printer1&quot;"/><PROPERTY NAME="binding_state" VALUE="active"/><PROPERTY NAME="next_binding_state" VALUE="expired"/><PROPERTY NAME="variable" VALUE="vendor-class-identifier=&quot;Hewlett-Packard JetDirect&quot; ddns-fwd-name=&quot;printer1.testing.net&quot; ddns-rev-name=&quot;6.1.10.10.in-addr.arpa.&quot; ddns-txt=&quot;0015dce5883b53fa75c8d90d1312f0c054&quot; lt=&quot;04294967295&quot;"/><PROPERTY NAME="ms_server_id" VALUE="."/><PROPERTY NAME="fingerprint" VALUE="HP Printer"/><PROPERTY NAME="fingerprint_class" VALUE="Printers"/></OBJECT>
<OBJECT><PROPERTY NAME="__type" VALUE="dhcp.lease"/><PROPERTY NAME="is_invalid_mac" VALUE="false"/><PROPERTY NAME="deferred_ttl" VALUE="300"/><PROPERTY NAME="ack_state" VALUE="from_peer"/><PROPERTY NAME="v6_prefix_bits" VALUE="0"/><PROPERTY NAME="is_ipv4" VALUE="true"/><PROPERTY NAME="vnode_id" VALUE="86"/><PROPERTY NAME="node_id" VALUE="86"/><PROPERTY NAME="ip_address" VALUE="10.10.1.44"/><PROPERTY NAME="dhcp_range" VALUE="10.10.1.5/101.10.1.254///0/"/><PROPERTY NAME="network_view" VALUE="0"/><PROPERTY NAME="starts" VALUE="2 2017/01/17 04:58:52"/><PROPERTY NAME="ends" VALUE="6 2017/01/21 04:58:52"/><PROPERTY NAME="tstp" VALUE="4 2016/06/23 19:17:54"/><PROPERTY NAME="tsfp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="atsfp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="cltt" VALUE="5 2016/06/17 19:17:54"/><PROPERTY NAME="hardware" VALUE="00:1a:4b:26:fd:85"/><PROPERTY NAME="client_hostname" VALUE="&quot;printer2&quot;"/><PROPERTY NAME="binding_state" VALUE="active"/><PROPERTY NAME="next_binding_state" VALUE="expired"/><PROPERTY NAME="variable" VALUE="lt=&quot;345600&quot; ddns-txt=&quot;0015dce5883b53fa75c8d90d1312f0c054&quot; ddns-rev-name=&quot;44.1.10.10.in-addr.arpa.&quot; ddns-fwd-name=&quot;printer2.testing.net&quot; vendor-class-identifier=&quot;Hewlett-Packard JetDirect&quot;"/><PROPERTY NAME="ms_server_id" VALUE="."/></OBJECT>
</DATABASE>
The script prints the attributes of every PROPERTY element, for example:
{'NAME': '__type', 'VALUE': 'dhcp.lease'}
{'NAME': 'is_invalid_mac', 'VALUE': 'false'}
{'NAME': 'deferred_ttl', 'VALUE': '300'}
{'NAME': 'ack_state', 'VALUE': 'renew'}
{'NAME': 'v6_prefix_bits', 'VALUE': '0'}
{'NAME': 'is_ipv4', 'VALUE': 'true'}
{'NAME': 'vnode_id', 'VALUE': '79'}
{'NAME': 'node_id', 'VALUE': '79'}
{'NAME': 'ip_address', 'VALUE': '10.10.1.6'}
How can I make it print only the ip_address value when __type is dhcp.lease?
I tried this:
l = 'dhcp.lease'
ip = 'ip_address'
for s in root.iter('PROPERTY'):
    n = s.attrib['NAME']
    d = s.attrib['VALUE']
    if d == l:
        print(s.attrib['VALUE'])
which prints:
dhcp.lease
I think I'm close to the finish line, but I need some help getting over it.
Answer 0 (score: 1)
You need to iterate over the OBJECT elements first. Whenever you find an object whose __type property is "dhcp.lease", print that object's "ip_address".
Try this:
for obj in tree.iter('OBJECT'):
    # Build a dictionary from NAME and VALUE of each property
    properties = dict([
        (p.attrib['NAME'], p.attrib['VALUE'])
        for p in obj.iter('PROPERTY')
    ])
    # Skip this object if it's not a dhcp lease
    # (.get avoids a KeyError when an object has no __type property)
    if properties.get('__type') != 'dhcp.lease':
        continue
    print(properties['ip_address'])
I am assuming the property names are unique within each object, so I can build a dictionary to make lookups easier.
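For reference, here is a minimal self-contained sketch of the same idea that runs without the real file. The inline sample is a trimmed-down, hypothetical stand-in for onedb-dhcp.xml (the "dhcp.range" object is invented purely to show a non-matching type), and lease_addresses is an illustrative helper name, not something from the original code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the real file.
# The "dhcp.range" object is invented to show a non-matching __type.
SAMPLE_XML = """
<DATABASE NAME="test">
  <OBJECT>
    <PROPERTY NAME="__type" VALUE="dhcp.lease"/>
    <PROPERTY NAME="ip_address" VALUE="10.10.1.6"/>
  </OBJECT>
  <OBJECT>
    <PROPERTY NAME="__type" VALUE="dhcp.range"/>
    <PROPERTY NAME="ip_address" VALUE="10.10.1.5"/>
  </OBJECT>
  <OBJECT>
    <PROPERTY NAME="__type" VALUE="dhcp.lease"/>
    <PROPERTY NAME="ip_address" VALUE="10.10.1.44"/>
  </OBJECT>
</DATABASE>
"""


def lease_addresses(root):
    """Return the ip_address of every OBJECT whose __type is dhcp.lease."""
    addresses = []
    for obj in root.iter('OBJECT'):
        # One dictionary per object: NAME -> VALUE
        properties = {p.get('NAME'): p.get('VALUE') for p in obj.iter('PROPERTY')}
        if properties.get('__type') != 'dhcp.lease':
            continue
        # Only collect objects that actually carry an ip_address
        if 'ip_address' in properties:
            addresses.append(properties['ip_address'])
    return addresses


root = ET.fromstring(SAMPLE_XML.strip())
for address in lease_addresses(root):
    print(address)
```

Grouping by OBJECT is what makes the two-condition check possible: iterating over PROPERTY elements alone, as in the question's attempt, loses track of which properties belong together.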
If you later want to extend this with more checks, you can add more if statements before the print. Something like (not valid Python): if properties['ends'] < now + 7 days: continue
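The date check above could be made runnable roughly as follows. This is only a sketch: it assumes the leading number in values such as "6 2017/01/21 04:58:52" is a weekday field that can be dropped, and parse_lease_time and ends_within are hypothetical helper names introduced here for illustration.

```python
from datetime import datetime, timedelta


def parse_lease_time(value):
    """Parse a value such as '6 2017/01/21 04:58:52'.

    Assumption: the first field is a weekday number and can be ignored.
    """
    _weekday, timestamp = value.split(' ', 1)
    return datetime.strptime(timestamp, '%Y/%m/%d %H:%M:%S')


def ends_within(properties, now, window=timedelta(days=7)):
    """Return True if the lease's 'ends' time falls before now + window."""
    return parse_lease_time(properties['ends']) < now + window


properties = {
    '__type': 'dhcp.lease',
    'ip_address': '10.10.1.6',
    'ends': '6 2017/01/21 04:58:52',
}
now = datetime(2017, 1, 17, 12, 0, 0)  # fixed "now" so the example is repeatable

# Mirrors the pseudocode: skip the lease if it ends within the next 7 days
if ends_within(properties, now):
    print('skipping', properties['ip_address'])
else:
    print(properties['ip_address'])
```

In the loop from the answer, the same test would sit right after the __type check, with continue in place of the first print and datetime.now() (or a UTC equivalent, if the timestamps are in UTC) in place of the fixed value.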