How to use LINQ to XML to extract data from complex XML with repeated elements and attributes

Asked: 2013-01-27 15:35:08

Tags: c# xml-parsing linq-to-xml

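This is my first time using LINQ with XML, and I am struggling to extract some data from an XML file. The problem seems to come from the way the XML is structured (which I have no control over): the same element names and attribute names appear more than once under each host.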

<host starttime="1357755777" endtime="1357755993">
    <status state="up" reason="arp-response"/>
    <address addr="192.168.1.1" addrtype="ipv4"/>
    <address addr="00:50:56:90:77:9F" addrtype="mac" vendor="VMware"/>
    <hostnames>
        <hostname name="test1.test.com" type="PTR"/>
    </hostnames>
    <ports>
        <extraports state="closed" count="95">
            <extrareasons reason="resets" count="95"/>
        </extraports>
        <port protocol="tcp" portid="135">
            <state state="open" reason="syn-ack" reason_ttl="128"/>
            <service name="msrpc" product="Microsoft Windows RPC" ostype="Windows" method="probed" conf="10">
                <cpe>cpe:/o:microsoft:windows</cpe>
            </service>
        </port>
        <port protocol="tcp" portid="139">
            <state state="open" reason="syn-ack" reason_ttl="128"/>
            <service name="netbios-ssn" method="probed" conf="10"/>
        </port>
        <port protocol="tcp" portid="445">
            <state state="open" reason="syn-ack" reason_ttl="128"/>
            <service name="microsoft-ds" product="Microsoft Windows 2003 or 2008 microsoft-ds" ostype="Windows" method="probed" conf="10">
                <cpe>cpe:/o:microsoft:windows</cpe>
            </service>
        </port>
        <port protocol="tcp" portid="3389">
            <state state="open" reason="syn-ack" reason_ttl="128"/>
            <service name="ms-wbt-server" product="Microsoft Terminal Service" ostype="Windows" method="probed" conf="10"/>
        </port>
        <port protocol="tcp" portid="8081">
            <state state="open" reason="syn-ack" reason_ttl="128"/>
            <service name="http" product="Network Associates ePolicy Orchestrator" method="probed" conf="10"/>
        </port>
    </ports>
</host>
<host starttime="1357755777" endtime="1357755993">
    <status state="up" reason="arp-response"/>
    <address addr="192.168.1.2" addrtype="ipv4"/>
    <address addr="00:50:56:90:67:8F" addrtype="mac" vendor="VMware"/>
    <hostnames>
        <hostname name="test2.test.com" type="PTR"/>
    </hostnames>
    <ports>
        <extraports state="closed" count="97">
            <extrareasons reason="resets" count="97"/>
        </extraports>
        <port protocol="tcp" portid="53">
            <state state="open" reason="syn-ack" reason_ttl="64"/>
            <service name="domain" product="dnsmasq" version="2.33" method="probed" conf="10">
                <cpe>cpe:/a:thekelleys:dnsmasq:2.33</cpe>
            </service>
            <script id="dns-nsid" output="&#xa;  bind.version: dnsmasq-2.33&#xa;"/>
        </port>
        <port protocol="tcp" portid="81">
            <state state="open" reason="syn-ack" reason_ttl="64"/>
            <service name="http" product="Apache httpd" method="probed" conf="10">
                <cpe>cpe:/a:apache:http_server</cpe>
            </service>
            <script id="http-title" output="Did not follow redirect to https://192.168.100.14:445/ and no page was returned."/>
            <script id="http-favicon" output="Unknown favicon MD5: 95CDE3E49C5B2645F99AAAAABB6CD4C6"/>
            <script id="http-methods" output="No Allow or Public header in OPTIONS response (status code 403)"/>
        </port>
        <port protocol="tcp" portid="445">
            <state state="open" reason="syn-ack" reason_ttl="64"/>
            <service name="http" product="Apache httpd" method="probed" conf="10">
                <cpe>cpe:/a:apache:http_server</cpe>
            </service>
            <script id="http-title" output="400 Bad Request"/>
            <script id="http-methods" output="No Allow or Public header in OPTIONS response (status code 403)"/>
        </port>
    </ports>
</host>
The above is a sample of the XML I have to work with. I simplified it a little for this question; the output comes from nmap.

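The data I need from the XML is as follows. For each host: status/state; address/addr where addrtype is ipv4; address/addr and address/vendor where addrtype is mac; and port/portid for every port.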

    XDocument NmapScan = XDocument.Load(file);

    var data = from item in NmapScan.Descendants("host")
               select new
               {
                   status = item.Element("status").Attribute("state").Value,
                   ip = item.Element("address").Attribute("addr").Value,
                   iptype = item.Element("address").Attribute("addrtype").Value
               };

    foreach (var p in data)
        Debug.WriteLine(p.ToString());

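None of the tutorials I have found cover this kind of XML. I can get the first entry of each type, but not the second, and I have not found a way to iterate over them. The output I want is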

status = up,ip = 192.168.100.171,iptype = ipv4,port = 22,port = 80

2 answers:

Answer 0 (score: 0)

You can nest a new select inside the original select. See the example in the answer to How to deserialize only part of a large xml file to c# classes?
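As a rough illustration of the nested-select idea applied to the XML in the question, the sketch below runs one inner query over every address child and another over every port, instead of taking only the first match. The class name NestedSelectSketch, the helper Summarize, and the trimmed sample wrapped in a single nmaprun root element are assumptions made for this example, not part of the original question or the linked answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NestedSelectSketch
{
    // Hypothetical helper: builds one summary line per <host> element.
    public static List<string> Summarize(XDocument scan)
    {
        var hosts = from host in scan.Descendants("host")
                    select new
                    {
                        Status = (string)host.Element("status").Attribute("state"),
                        // Inner select: every <address> child, not just the first.
                        Addresses = (from a in host.Elements("address")
                                     select new
                                     {
                                         Addr = (string)a.Attribute("addr"),
                                         Type = (string)a.Attribute("addrtype")
                                     }).ToList(),
                        // Inner select: every <port> under <ports>.
                        Ports = (from p in host.Elements("ports").Elements("port")
                                 select (string)p.Attribute("portid")).ToList()
                    };

        return hosts.Select(h =>
                "status = " + h.Status
                + string.Concat(h.Addresses.Select(a => ", " + a.Type + " = " + a.Addr))
                + string.Concat(h.Ports.Select(p => ", port = " + p)))
            .ToList();
    }

    public static void Main()
    {
        // Trimmed sample; single-quoted attributes keep the C# string readable.
        XDocument scan = XDocument.Parse(@"
<nmaprun>
  <host>
    <status state='up' reason='arp-response'/>
    <address addr='192.168.1.1' addrtype='ipv4'/>
    <address addr='00:50:56:90:77:9F' addrtype='mac' vendor='VMware'/>
    <ports>
      <port protocol='tcp' portid='135'/>
      <port protocol='tcp' portid='139'/>
    </ports>
  </host>
</nmaprun>");

        foreach (string line in Summarize(scan))
            Console.WriteLine(line);
    }
}
```

The casts from XAttribute to string return null when an attribute is absent, so only a missing status element would cause an exception here.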

Answer 1 (score: 0)

Using LINQ:

foreach (XElement hostElement in NmapScan.Descendants("host"))
{
    // Get the "address" element whose "addrtype" attribute is "mac".
    // Single() throws unless exactly one element matches.
    XElement macAddressElement = (from addressElement in hostElement.Elements("address")
                                  where (string)addressElement.Attribute("addrtype") == "mac"
                                  select addressElement).Single();

    // Get the "address" element whose "addrtype" attribute is "ipv4".
    XElement ipV4AddressElement = (from addressElement in hostElement.Elements("address")
                                   where (string)addressElement.Attribute("addrtype") == "ipv4"
                                   select addressElement).Single();

    var p = new
    {
        status = hostElement.Element("status").Attribute("state").Value,
        addrIpv4 = ipV4AddressElement.Attribute("addr").Value,
        addrMac = macAddressElement.Attribute("addr").Value,
        // The cast yields null instead of throwing when "vendor" is absent.
        addrVendor = (string)macAddressElement.Attribute("vendor"),
        // Join the port ids so they print as text rather than as the List type name.
        ports = string.Join(", ", from portElement in hostElement.Element("ports").Elements("port")
                                  select (string)portElement.Attribute("portid"))
    };

    Console.WriteLine(p);
}
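
The loop above relies on Single(), which throws when a host has no address of the requested type. A more tolerant variant might look like the sketch below, under the assumption that some hosts in a real scan lack a mac address or a ports element. The names TolerantHostReader, FindAddress, Attr, and Describe, and the "n/a" placeholder, are made up for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class TolerantHostReader
{
    // Hypothetical helper: first <address> with the given addrtype, or null.
    public static XElement FindAddress(XElement host, string addrType)
    {
        return host.Elements("address")
                   .FirstOrDefault(a => (string)a.Attribute("addrtype") == addrType);
    }

    // Hypothetical helper: attribute value, or "n/a" if the element or attribute is missing.
    public static string Attr(XElement element, string name)
    {
        return element == null ? "n/a" : ((string)element.Attribute(name) ?? "n/a");
    }

    // Formats one <host> element without throwing on missing children.
    public static string Describe(XElement host)
    {
        XElement mac = FindAddress(host, "mac");

        // Elements("ports") is empty when <ports> is absent, so no null check is needed.
        string ports = string.Join(", ",
            host.Elements("ports").Elements("port")
                .Select(p => (string)p.Attribute("portid")));

        return "status = " + Attr(host.Element("status"), "state")
             + ", ipv4 = " + Attr(FindAddress(host, "ipv4"), "addr")
             + ", mac = " + Attr(mac, "addr")
             + ", vendor = " + Attr(mac, "vendor")
             + ", ports = " + ports;
    }

    public static void Main()
    {
        // A host with no mac address, which would make Single() throw.
        XElement host = XElement.Parse(@"
<host>
  <status state='up' reason='syn-ack'/>
  <address addr='192.168.1.2' addrtype='ipv4'/>
  <ports>
    <port protocol='tcp' portid='53'/>
    <port protocol='tcp' portid='81'/>
  </ports>
</host>");

        Console.WriteLine(Describe(host));
    }
}
```

FirstOrDefault silently picks the first match if a host lists several addresses of the same type; whether that is acceptable depends on the data being parsed.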