Python: parsing Nmap XML output, "elem key=" and NodeList

Date: 2015-10-20 01:25:20

Tags: python xml parsing nmap nodelist

I am trying to parse specific values out of an Nmap XML file. Part of the XML file looks like this:

<nmaprun scanner="nmap" args="nmap -A -P0 -oA scanoutput 192.168.1.5" start="1445258532" startstr="Mon Oct 19 08:42:12 2015" version="6.47" xmloutputversion="1.04">
    <hostscript>
        <script id="smb-os-discovery" output="&#10;  OS: Windows Server 2008 R2 Standard 7601 Service Pack 1 (Windows Server 2008 R2 Standard 6.1)&#10;  OS CPE: cpe:/o:microsoft:windows_server_2008::sp1&#10;  Computer name: SOMEHOSTNAME&#10;  NetBIOS computer name: SOMEHOSTNAME&#10;  Domain name: domain.local&#10;  Forest name: domain.local&#10;  FQDN: SOMEHOSTNAME.domain.local&#10;  System time: 2015-10-19T08:50:07-04:00&#10;">
            <elem key="os">Windows Server 2008 R2 Standard 7601 Service Pack 1</elem>
            <elem key="lanmanager">Windows Server 2008 R2 Standard 6.1</elem>
            <elem key="server">SOMEHOSTNAME\x00</elem>
            <elem key="date">2015-10-19T08:50:07-04:00</elem>
            <elem key="fqdn">SOMEHOSTNAME.domain.local</elem>
            <elem key="domain_dns">domain.local</elem>
            <elem key="forest_dns">domain.local</elem>
            <elem key="workgroup">HOME\x00</elem>
            <elem key="cpe">cpe:/o:microsoft:windows_server_2008::sp1</elem>
        </script>
    </hostscript>
</nmaprun>

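I am trying to get the value of each key, but I am not sure how to go about it. For example, how do I get the value from elem key="os"? So far I can get the full output, but it turns into a mess when I add it to a CSV, and I need each value separated. Here is my code: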

serveros = [script.getAttribute('output') for script in hosttag.getElementsByTagName('script') if script.getAttribute('id') == 'smb-os-discovery']

If I change it to:

serveros = [script.getElementsByTagName('os') for script in hosttag.getElementsByTagName('script') if script.getAttribute('id') == 'smb-os-discovery']

I get this error:

TypeError: sequence item 0: expected string, NodeList found

Thanks in advance!

1 answer:

Answer 0 (score: 0)

Maybe something like this:

class ScriptResult:
    """Holds the XML attributes (id, output, ...) of one <script> element."""

    def __init__(self, script, port_number):
        # 'script' is an ElementTree-style element (it exposes .attrib),
        # not an xml.dom.minidom node.
        self.port_number = port_number
        # Copy every XML attribute of <script> onto the instance.
        for k, v in script.attrib.items():
            self.__dict__[k] = v

    def __str__(self):
        d = '\n'
        for k, v in self.__dict__.items():
            d += '    %-30s :   %s\n' % (k, v)
        return "ScriptResult(%s)\n" % d


class Host:
    def __init__(self):
        self.script_results = []   # list of ScriptResult objects

    def print_results(self):
        for i in self.script_results:
            print(i)


class XML_Parser:

    def get_hostscripts(self, host, xml_host_element):
        # Collect every <script> under each <hostscript> child of the host element.
        for hs in xml_host_element.findall('hostscript'):
            for s in hs.findall('script'):
                host.script_results.append(ScriptResult(s, 'host'))
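
Note that the classes above copy only the attributes of each <script> element (id, output), not its <elem> children. The values the question asks about are the text of tags named elem, and "os" is the value of their key attribute. That is why getElementsByTagName('os') matches nothing and returns an empty NodeList. The TypeError is presumably raised later, when the list of NodeLists is joined into a string (an assumption, since that code is not shown). A minimal sketch with xml.dom.minidom, the API used in the question, might look like the following. The helper name script_elems and the trimmed SAMPLE document are made up for illustration.

```python
from xml.dom import minidom

# Trimmed-down sample modeled on the XML in the question (illustrative only).
SAMPLE = """<nmaprun scanner="nmap">
  <hostscript>
    <script id="smb-os-discovery">
      <elem key="os">Windows Server 2008 R2 Standard 7601 Service Pack 1</elem>
      <elem key="fqdn">SOMEHOSTNAME.domain.local</elem>
      <elem key="domain_dns">domain.local</elem>
    </script>
  </hostscript>
</nmaprun>"""


def script_elems(node, script_id):
    """Return {key: text} for the <elem> tags of the <script> with this id.

    Hypothetical helper, not part of minidom or Nmap.
    """
    values = {}
    for script in node.getElementsByTagName('script'):
        if script.getAttribute('id') != script_id:
            continue
        for elem in script.getElementsByTagName('elem'):
            # The value is stored in the element's child text nodes,
            # not in an attribute and not in a tag named after the key.
            text = ''.join(child.data for child in elem.childNodes
                           if child.nodeType == child.TEXT_NODE)
            values[elem.getAttribute('key')] = text.strip()
    return values


doc = minidom.parseString(SAMPLE)
info = script_elems(doc, 'smb-os-discovery')
print(info['os'])
```

Because each value ends up under its own key, the dictionary can be written to a CSV one column per key, for example with csv.DictWriter. With ElementTree, as used in the answer above, the same idea would be to loop over s.findall('elem') and read e.get('key') and e.text.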