Parsing XML with Python

Time: 2015-03-02 11:06:43

Tags: python xml parsing

I am trying to parse an XML document with Python. Here is my code:

from xml.dom import minidom

xmldoc = minidom.parse("aula.xml")

hosts = xmldoc.getElementsByTagName("host")

for host in hosts:
    address = host.getElementByTag("address")
    ip = address.attributes["addr"]
    IP = ip.value
    print("IP:%s"%(IP))

Running it gives me this error:

Traceback (most recent call last):
  File "simple.py", line 8, in <module>
    address = host.getElementByTag("address")
AttributeError: Element instance has no attribute 'getElementByTag'

The XML file:

<?xml version="1.0"?>
<!DOCTYPE nmaprun>
<?xml-stylesheet href="file:///usr/local/bin/../share/nmap/nmap.xsl" type="text/xsl"?>
<!-- Nmap 6.47 scan initiated Thu Feb 26 11:38:24 2015 as: nmap -oX aula.xml x.x.x.x/26 -->
<nmaprun scanner="nmap" args="nmap -oX aula.xml x.x.x.x/26" start="1424947104" startstr="Thu Feb 26 11:38:24 2015" version="6.47" xmloutputversion="1.04">
  <scaninfo type="connect" protocol="tcp" numservices="1000" services="a lot of numbers"/>
  <verbose level="0"/>
  <debugging level="0"/>
  <host starttime="1424947104" endtime="1424947111"><status state="up" reason="conn-refused" reason_ttl="0"/>
    <address addr="x.x.x.x" addrtype="ipv4"/>
    <hostnames>
    </hostnames>
    <ports>
        <extraports state="closed" count="998">
            <extrareasons reason="conn-refused" count="998"/>
        </extraports>
        <port protocol="tcp" portid="22"><state state="open" reason="syn-ack" reason_ttl="0"/><service name="ssh" method="table" conf="3"/></port>
        <port protocol="tcp" portid="111"><state state="open" reason="syn-ack" reason_ttl="0"/><service name="rpcbind" method="table" conf="3"/></port>
    </ports>
    <times srtt="1280" rttvar="264" to="100000"/>
</host>
</nmaprun>

3 answers:

Answer 0 (score: 0)

A host can contain more than one address element, and getElementsByTagName returns a list, so use

address = host.getElementsByTagName("address")[0]

to get the first address.
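To illustrate why the index is needed, here is a minimal sketch. The inline XML, the second `addrtype="mac"` entry, and the example address values are made up for the demonstration and are not taken from the asker's file.

```python
from xml.dom import minidom

# Hypothetical sample data: one host carrying two address elements.
SAMPLE = """<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <address addr="00:11:22:33:44:55" addrtype="mac"/>
  </host>
</nmaprun>"""

doc = minidom.parseString(SAMPLE)
host = doc.getElementsByTagName("host")[0]

# getElementsByTagName always returns a list-like NodeList, even for one match.
addresses = host.getElementsByTagName("address")

# Index 0 selects the first match; getAttribute returns the value as a string.
first_ip = addresses[0].getAttribute("addr")
print(len(addresses), first_ip)
```

Note that `[0]` raises an IndexError when a host has no address element at all, so this form only suits input where at least one is known to exist.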

Answer 1 (score: 0)

The method you are looking for is getElementsByTagName, not getElementByTag. It returns a list of the matching elements, so you have to iterate over it. For example:

from xml.dom import minidom

xmldoc = minidom.parse("aula.xml")

hosts = xmldoc.getElementsByTagName("host")

for host in hosts:
    addresses = host.getElementsByTagName("address")
    for address in addresses:
        ip = address.attributes["addr"]
        IP = ip.value
        print("IP:%s"%(IP))
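The same nested-loop pattern extends to other elements in the file. As a rough sketch, the following collects the open ports for each address. The function name `open_ports_by_address` and the trimmed-down sample XML are assumptions made for illustration; only the element and attribute names come from the file in the question.

```python
from xml.dom import minidom

# Hypothetical sample, reduced to the elements this sketch reads.
SAMPLE = """<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"><state state="open"/></port>
      <port protocol="tcp" portid="111"><state state="open"/></port>
      <port protocol="tcp" portid="80"><state state="closed"/></port>
    </ports>
  </host>
</nmaprun>"""


def open_ports_by_address(xml_text):
    """Map each address found under a host to that host's open port numbers."""
    doc = minidom.parseString(xml_text)
    result = {}
    for host in doc.getElementsByTagName("host"):
        ports = []
        for port in host.getElementsByTagName("port"):
            # Each port is expected to hold one state child; skip it if absent.
            states = port.getElementsByTagName("state")
            if states and states[0].getAttribute("state") == "open":
                ports.append(int(port.getAttribute("portid")))
        for address in host.getElementsByTagName("address"):
            result[address.getAttribute("addr")] = ports
    return result


print(open_ports_by_address(SAMPLE))
```

Calling getElementsByTagName on an element searches only that element's descendants, which is what keeps each host's ports separate from the others.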

Answer 2 (score: 0)

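  1. There is no getElementByTag method. Use the getElementsByTagName() method to get the list of address elements.
  2. Check that the list is not empty, then read the addr attribute value from the first address element.
  3. Demo: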

    from xml.dom import minidom
    
    p = '/home/vivek/Desktop/Work/input_minidom.xml'
    xmldoc = minidom.parse(p)
    
    hosts = xmldoc.getElementsByTagName("host")
    
    for host in hosts:
        address = host.getElementsByTagName("address")
        if address:
            ip = address[0].attributes["addr"]
            IP = ip.value
            print("IP:%s"%(IP))
    

    Output:

    $ python task4.py 
    IP:x.x.x.x
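The emptiness check in the demo matters when some hosts have no address element. Here is a small sketch of that case, assuming a made-up two-host sample and a hypothetical helper named `first_addresses`:

```python
from xml.dom import minidom

# Hypothetical sample: the second host has no address element.
SAMPLE = """<nmaprun>
  <host><address addr="192.0.2.10" addrtype="ipv4"/></host>
  <host><status state="down"/></host>
</nmaprun>"""


def first_addresses(xml_text):
    """Return the first addr value of every host that has one."""
    doc = minidom.parseString(xml_text)
    found = []
    for host in doc.getElementsByTagName("host"):
        addresses = host.getElementsByTagName("address")
        # An empty NodeList is falsy, so hosts without an address are skipped.
        if addresses and addresses[0].hasAttribute("addr"):
            found.append(addresses[0].getAttribute("addr"))
    return found


print(first_addresses(SAMPLE))
```

Without the check, indexing `[0]` on the second host would raise an IndexError.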