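Print a specific XML value only when two conditions are met

Time: 2017-01-22 17:54:59

Tags: python xml elementtree

I have a large XML file that I am parsing, and I want to print one specific value only when two other values match.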

Here is the code so far:

#!/usr/local/bin/python

import xml.etree.ElementTree as ET
tree = ET.parse('onedb-dhcp.xml')
root = tree.getroot()

# This successfully gets all items in the xml:

print 'This successfully gets all items in the xml:\n'

for p in root.iter('PROPERTY'):
    print p.attrib
print '\n----------------------------------------------------------'

Here is a sample of the XML file:

<DATABASE NAME="test" VERSION="43-39" MD5="." SCHEMA-MD5="." INT-VERSION="43-39">
<OBJECT><PROPERTY NAME="__type" VALUE="dhcp.lease"/><PROPERTY NAME="is_invalid_mac" VALUE="false"/><PROPERTY NAME="deferred_ttl" VALUE="300"/><PROPERTY NAME="ack_state" VALUE="renew"/><PROPERTY NAME="v6_prefix_bits" VALUE="0"/><PROPERTY NAME="is_ipv4" VALUE="true"/><PROPERTY NAME="vnode_id" VALUE="79"/><PROPERTY NAME="node_id" VALUE="79"/><PROPERTY NAME="ip_address" VALUE="10.10.1.6"/><PROPERTY NAME="dhcp_range" VALUE="10.10.1.5/10.10.1.254///0/"/><PROPERTY NAME="network_view" VALUE="0"/><PROPERTY NAME="starts" VALUE="2 2017/01/17 04:58:52"/><PROPERTY NAME="ends" VALUE="6 2017/01/21 04:58:52"/><PROPERTY NAME="tstp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="tsfp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="atsfp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="cltt" VALUE="2 2017/01/17 04:58:52"/><PROPERTY NAME="hardware" VALUE="00:1a:4b:26:fd:85"/><PROPERTY NAME="client_hostname" VALUE="&quot;printer1&quot;"/><PROPERTY NAME="binding_state" VALUE="active"/><PROPERTY NAME="next_binding_state" VALUE="expired"/><PROPERTY NAME="variable" VALUE="vendor-class-identifier=&quot;Hewlett-Packard JetDirect&quot; ddns-fwd-name=&quot;printer1.testing.net&quot; ddns-rev-name=&quot;6.1.10.10.in-addr.arpa.&quot; ddns-txt=&quot;0015dce5883b53fa75c8d90d1312f0c054&quot; lt=&quot;04294967295&quot;"/><PROPERTY NAME="ms_server_id" VALUE="."/><PROPERTY NAME="fingerprint" VALUE="HP Printer"/><PROPERTY NAME="fingerprint_class" VALUE="Printers"/></OBJECT>
<OBJECT><PROPERTY NAME="__type" VALUE="dhcp.lease"/><PROPERTY NAME="is_invalid_mac" VALUE="false"/><PROPERTY NAME="deferred_ttl" VALUE="300"/><PROPERTY NAME="ack_state" VALUE="from_peer"/><PROPERTY NAME="v6_prefix_bits" VALUE="0"/><PROPERTY NAME="is_ipv4" VALUE="true"/><PROPERTY NAME="vnode_id" VALUE="86"/><PROPERTY NAME="node_id" VALUE="86"/><PROPERTY NAME="ip_address" VALUE="10.10.1.44"/><PROPERTY NAME="dhcp_range" VALUE="10.10.1.5/101.10.1.254///0/"/><PROPERTY NAME="network_view" VALUE="0"/><PROPERTY NAME="starts" VALUE="2 2017/01/17 04:58:52"/><PROPERTY NAME="ends" VALUE="6 2017/01/21 04:58:52"/><PROPERTY NAME="tstp" VALUE="4 2016/06/23 19:17:54"/><PROPERTY NAME="tsfp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="atsfp" VALUE="1 2017/01/23 04:58:52"/><PROPERTY NAME="cltt" VALUE="5 2016/06/17 19:17:54"/><PROPERTY NAME="hardware" VALUE="00:1a:4b:26:fd:85"/><PROPERTY NAME="client_hostname" VALUE="&quot;printer2&quot;"/><PROPERTY NAME="binding_state" VALUE="active"/><PROPERTY NAME="next_binding_state" VALUE="expired"/><PROPERTY NAME="variable" VALUE="lt=&quot;345600&quot; ddns-txt=&quot;0015dce5883b53fa75c8d90d1312f0c054&quot; ddns-rev-name=&quot;44.1.10.10.in-addr.arpa.&quot; ddns-fwd-name=&quot;printer2.testing.net&quot; vendor-class-identifier=&quot;Hewlett-Packard JetDirect&quot;"/><PROPERTY NAME="ms_server_id" VALUE="."/></OBJECT>
</DATABASE>

When I run the script above, this is what gets printed to the screen (just a sample):

{'NAME': '__type', 'VALUE': 'dhcp.lease'}
{'NAME': 'is_invalid_mac', 'VALUE': 'false'}
{'NAME': 'deferred_ttl', 'VALUE': '300'}
{'NAME': 'ack_state', 'VALUE': 'renew'}
{'NAME': 'v6_prefix_bits', 'VALUE': '0'}
{'NAME': 'is_ipv4', 'VALUE': 'true'}
{'NAME': 'vnode_id', 'VALUE': '79'}
{'NAME': 'node_id', 'VALUE': '79'}
{'NAME': 'ip_address', 'VALUE': '10.10.1.6'}

How can I set this up to print only the ip_address value when __type = dhcp.lease?

I tried this:

l = 'dhcp.lease'
ip = 'ip_address'

for s in root.iter('PROPERTY'):
    n = s.attrib['NAME']
    d = s.attrib['VALUE']
    if d == l:
        print s.attrib['VALUE']

which prints:

dhcp.lease

I think I am close to the finish line, but I need some help getting over it.

1 Answer:

Answer 0: (score: 1)

You need to iterate over the objects first. For each object whose "__type" property is "dhcp.lease", print that object's "ip_address" property.

Try this:

for obj in tree.iter('OBJECT'):

    # Build a dictionary from NAME and VALUE of each property
    properties = {
        p.attrib['NAME']: p.attrib['VALUE']
        for p in obj.iter('PROPERTY')
    }

    # Skip this object if it's not a dhcp lease
    # (.get avoids a KeyError when an object has no __type property)
    if properties.get('__type') != 'dhcp.lease':
        continue

    # Parentheses keep this line valid on both Python 2 and Python 3
    print(properties['ip_address'])

I am assuming that property names are unique within each object, so I can build a dictionary to make lookups easier.
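If that uniqueness assumption is in doubt, it can be checked before relying on the dictionary, since a repeated NAME would silently overwrite the earlier value. A minimal sketch: the helper name `duplicate_names` is hypothetical, and the inline sample is made-up data (its second object repeats a NAME on purpose) standing in for the real file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample data; the second OBJECT deliberately repeats "ip_address".
SAMPLE = """<DATABASE NAME="test">
<OBJECT><PROPERTY NAME="__type" VALUE="dhcp.lease"/><PROPERTY NAME="ip_address" VALUE="10.10.1.6"/></OBJECT>
<OBJECT><PROPERTY NAME="__type" VALUE="dhcp.lease"/><PROPERTY NAME="ip_address" VALUE="10.10.1.44"/><PROPERTY NAME="ip_address" VALUE="10.10.1.45"/></OBJECT>
</DATABASE>"""


def duplicate_names(obj):
    """Return the sorted NAME values that occur more than once in one OBJECT."""
    counts = Counter(p.get('NAME') for p in obj.iter('PROPERTY'))
    return sorted(name for name, n in counts.items() if n > 1)


root = ET.fromstring(SAMPLE)

# One list of repeated names per OBJECT, in document order
duplicates = [duplicate_names(obj) for obj in root.iter('OBJECT')]

for index, names in enumerate(duplicates):
    if names:
        print('OBJECT %d repeats: %s' % (index, ', '.join(names)))
```

If nothing is reported for the real file, the dictionary approach is safe for it.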

If you want to extend this later with more checks, you can add more if statements before the print. Something like (not valid Python): if properties['ends'] < now + 7 days: continue
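The date check above could be made runnable roughly as follows. This is only a sketch under a few assumptions: the leading number in values such as "6 2017/01/21 04:58:52" is treated as a weekday index and ignored, the helper names `parse_lease_time` and `leases_ending_after` are hypothetical, and the inline sample (including the second lease's "ends" value) is made-up data. A fixed `NOW` is used so the result does not depend on when the script runs.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Made-up sample data standing in for the real file.
SAMPLE = """<DATABASE NAME="test">
<OBJECT><PROPERTY NAME="__type" VALUE="dhcp.lease"/><PROPERTY NAME="ip_address" VALUE="10.10.1.6"/><PROPERTY NAME="ends" VALUE="6 2017/01/21 04:58:52"/></OBJECT>
<OBJECT><PROPERTY NAME="__type" VALUE="dhcp.lease"/><PROPERTY NAME="ip_address" VALUE="10.10.1.44"/><PROPERTY NAME="ends" VALUE="3 2017/02/15 04:58:52"/></OBJECT>
</DATABASE>"""


def parse_lease_time(value):
    """Parse '6 2017/01/21 04:58:52', ignoring the leading field (assumed weekday)."""
    _, stamp = value.split(' ', 1)
    return datetime.strptime(stamp, '%Y/%m/%d %H:%M:%S')


def leases_ending_after(root, now, window=timedelta(days=7)):
    """Return ip_address of each dhcp.lease whose 'ends' is at least `window` past `now`."""
    addresses = []
    for obj in root.iter('OBJECT'):
        properties = {p.get('NAME'): p.get('VALUE') for p in obj.iter('PROPERTY')}

        # First condition: the object must be a dhcp lease
        if properties.get('__type') != 'dhcp.lease':
            continue

        # Second condition: skip leases with no end time or ending within the window
        ends = properties.get('ends')
        if ends is None or parse_lease_time(ends) < now + window:
            continue

        addresses.append(properties.get('ip_address'))
    return addresses


root = ET.fromstring(SAMPLE)
NOW = datetime(2017, 1, 17, 12, 0, 0)  # fixed reference time for a repeatable result

for address in leases_ending_after(root, NOW):
    print(address)
```

For the real file, `ET.parse('onedb-dhcp.xml').getroot()` would replace `ET.fromstring(SAMPLE)`, and `datetime.now()` would replace the fixed `NOW`.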