How do I use C# and LINQ to extract information from deep inside XML?

Time: 2011-04-25 20:04:18

Tags: c# .net xml linq linq-to-xml

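This is my first post on StackOverflow, so please bear with me. I apologize in advance if my code sample is a bit long.

Using C# and LINQ, I am trying to identify a series of third-level id elements (000049 in this case) in a larger XML file. Each third-level id is unique, and the ones I want are determined by information in each one's descendants. More specifically, if type == A, location type(old) == vault, and location type(new) == out, then I want to select the id. Below are the XML and the C# code I am using.

In general, my code works. As written below, it returns the id 000049 twice, which is correct. However, I have found a glitch. If I remove the first history block that contains type == A, my code still returns the id 000049 twice, when it should return it only once. I know why this happens, but I cannot find a better way to write the query. Is there a better way to run my query that gives the output I want and still uses LINQ?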

My XML:

<?xml version="1.0" encoding="ISO8859-1" ?>
<data type="historylist">
    <date type="runtime">
        <year>2011</year>
        <month>04</month>
        <day>22</day>
        <dayname>Friday</dayname>
        <hour>15</hour>
        <minutes>24</minutes>
        <seconds>46</seconds>
    </date>
    <customer>
        <id>0001</id>
        <description>customer</description>
        <mediatype>
            <id>kit</id>
            <description>customer kit</description>
            <volume>
                <id>000049</id>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>03</hour>
                        <minutes>00</minutes>
                        <seconds>02</seconds>
                    </date>
                    <userid>batch</userid>
                    <type>OD</type>
                    <location type="old">
                        <repository>vault</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>06</hour>
                        <minutes>43</minutes>
                        <seconds>33</seconds>
                    </date>
                    <userid>vaultred</userid>
                    <type>A</type>
                    <location type="old">
                        <repository>vault</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>06</hour>
                        <minutes>43</minutes>
                        <seconds>33</seconds>
                    </date>
                    <userid>vaultred</userid>
                    <type>S</type>
                    <location type="old">
                        <repository>vault</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>06</hour>
                        <minutes>45</minutes>
                        <seconds>00</seconds>
                    </date>
                    <userid>batch</userid>
                    <type>O</type>
                    <location type="old">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>site</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>11</hour>
                        <minutes>25</minutes>
                        <seconds>59</seconds>
                    </date>
                    <userid>ihcmdm</userid>
                    <type>A</type>
                    <location type="old">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>site</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>11</hour>
                        <minutes>25</minutes>
                        <seconds>59</seconds>
                    </date>
                    <userid>ihcmdm</userid>
                    <type>S</type>
                    <location type="old">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>site</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
            </volume>
            ...

My C# code:

IEnumerable<XElement> caseIdLeavingVault =
    from volume in root.Descendants("volume")
    where
        (from type in volume.Descendants("type")
         where type.Value == "A"
         select type).Any() &&
        (from locationOld in volume.Descendants("location")
         where
             ((String)locationOld.Attribute("type") == "old" &&
              (String)locationOld.Element("repository") == "vault") &&
             (from locationNew in volume.Descendants("location")
              where
                  ((String)locationNew.Attribute("type") == "new" &&
                   (String)locationNew.Element("repository") == "out")
              select locationNew).Any()
         select locationOld).Any()
    select volume.Element("id");

    ...

foreach (XElement volume in caseIdLeavingVault)
{
    Console.WriteLine(volume.Value.ToString());
}

Thanks.


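OK guys, I am stumped again. Given the same situation and @Elian's solution (which works great), I need the "optime" and "movedate" dates that belong to the history used to select the id. Does that make sense? I was hoping to end up with something like this: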

select new { 
    id = volume.Element("id").Value, 

    // this is from "optime"
    opYear = <whatever>("year").Value, 
    opMonth = <whatever>("month").Value, 
    opDay = <whatever>("day").Value, 

    // this is from "movedate"
    mvYear = <whatever>("year").Value, 
    mvMonth = <whatever>("month").Value, 
    mvDay = <whatever>("day").Value 
} 

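I have tried many different combinations, but the type attribute of <date type="optime"> and <date type="movedate"> keeps getting in my way, and I cannot seem to get what I want.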


OK. I found a solution that works well:

select new {
    caseId = volume.Element("id").Value,

    // this is from "optime"
    opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value,
    opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value,
    opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value,

    // this is from "movedate"
    mvYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value,
    mvMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value,
    mvDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value
};

However, it does fail when an id is found that has no "movedate". A few of those exist, so that is what I am working on now.


Well, late yesterday afternoon I finally worked out the solution I had been after:

var caseIdLeavingSite =
    from volume in root.Descendants("volume")
    where volume.Elements("history").Any(
        h => h.Element("type").Value == "A" &&
            h.Elements("location").Any(l => l.Attribute("type").Value == "old" && ((l.Element("repository").Value == "site") || (l.Element("repository").Value == "init"))) &&
            h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "toVault")
        )
    select new {
        caseId = volume.Element("id").Value,
        opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value,
        opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value,
        opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value,
        mvYear = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value) : "0",
        mvMonth = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value) : "0",
        mvDay = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value) : "0"
    };

This meets the requirements that @Elian helped with and also gets the necessary additional date information. By using the ternary operator, it also accounts for the few cases where there is no "movedate" element.

Now, if anyone knows how to make this more efficient, I am still interested. Thanks.
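One possible direction, sketched under assumptions rather than taken from the thread: find the matching history once with a `let` clause and read every date field from that element through a small helper, instead of re-scanning the volume for each field. The names `LeavingSiteSketch`, `HasLocation`, `DatePart`, and the sample document are hypothetical. The sketch also differs in behavior from the query above: it reads the dates from the first matching history rather than from the first date of that type anywhere in the volume. It needs C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LeavingSiteSketch
{
    // Hypothetical sample: one matching history with an "optime" date and no "movedate".
    public const string Sample = @"<data>
  <volume>
    <id>000049</id>
    <history>
      <date type=""optime""><year>2011</year><month>04</month><day>22</day></date>
      <type>A</type>
      <location type=""old""><repository>site</repository></location>
      <location type=""new""><repository>toVault</repository></location>
    </history>
  </volume>
</data>";

    // True when the history has a <location type="..."> whose repository is one of the given names.
    static bool HasLocation(XElement history, string type, params string[] repositories) =>
        history.Elements("location").Any(l =>
            (string)l.Attribute("type") == type &&
            repositories.Contains((string)l.Element("repository")));

    // Reads one part ("year", "month", "day") of the history's <date type="...">,
    // falling back to "0" when the date or the part is missing.
    static string DatePart(XElement history, string dateType, string part)
    {
        XElement date = history.Elements("date")
            .FirstOrDefault(d => (string)d.Attribute("type") == dateType);
        return (string)date?.Element(part) ?? "0";
    }

    public static List<(string CaseId, string OpYear, string OpMonth, string OpDay,
                        string MvYear, string MvMonth, string MvDay)> Find(XElement root) =>
        (from volume in root.Descendants("volume")
         // Locate the first matching history once and reuse it for every field.
         let history = volume.Elements("history").FirstOrDefault(h =>
             (string)h.Element("type") == "A" &&
             HasLocation(h, "old", "site", "init") &&
             HasLocation(h, "new", "toVault"))
         where history != null
         select (CaseId: (string)volume.Element("id"),
                 OpYear: DatePart(history, "optime", "year"),
                 OpMonth: DatePart(history, "optime", "month"),
                 OpDay: DatePart(history, "optime", "day"),
                 MvYear: DatePart(history, "movedate", "year"),
                 MvMonth: DatePart(history, "movedate", "month"),
                 MvDay: DatePart(history, "movedate", "day"))).ToList();

    public static void Main()
    {
        foreach (var m in Find(XElement.Parse(Sample)))
        {
            Console.WriteLine($"{m.CaseId} op={m.OpYear}-{m.OpMonth}-{m.OpDay} mv={m.MvYear}-{m.MvMonth}-{m.MvDay}");
        }
        // prints: 000049 op=2011-04-22 mv=0-0-0
    }
}
```

Casting an element or attribute to string yields null when it is missing, so the helper does not throw where `.Value` would.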

2 answers:

Answer 0 (score: 8)

I think you want something like this:

IEnumerable<XElement> caseIdLeavingVault =
    from volume in document.Descendants("volume")
    where volume.Elements("history").Any(
        h => h.Element("type").Value == "A" &&
            h.Elements("location").Any(l => l.Attribute("type").Value == "old" && l.Element("repository").Value == "vault") &&
            h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "out")
        )
    select volume.Element("id");

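Your code independently checks whether the volume has a <history> element of type A and a (not necessarily the same) <history> element that has the required <location> elements.

The code above checks whether there is a single <history> element that is both of type A and contains the required <location> elements.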
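The difference between the two checks shows up on a small made-up document in which the vault-to-out move and the type A entry sit in different <history> elements. The following is a minimal sketch of that comparison. The class name `SameHistoryDemo`, the helper `HasLocation`, and the sample XML are assumptions introduced for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SameHistoryDemo
{
    // Hypothetical sample: the vault -> out move is in a type "S" history,
    // while the only type "A" history moves out -> site.
    public const string Sample = @"<data>
  <volume>
    <id>000049</id>
    <history>
      <type>S</type>
      <location type=""old""><repository>vault</repository></location>
      <location type=""new""><repository>out</repository></location>
    </history>
    <history>
      <type>A</type>
      <location type=""old""><repository>out</repository></location>
      <location type=""new""><repository>site</repository></location>
    </history>
  </volume>
</data>";

    // True when any of the given <location> elements has the given type and repository.
    static bool HasLocation(IEnumerable<XElement> locations, string type, string repository) =>
        locations.Any(l =>
            (string)l.Attribute("type") == type &&
            (string)l.Element("repository") == repository);

    // Each condition is tested anywhere under the volume, as in the question's query.
    public static string[] IndependentChecks(XElement root) =>
        root.Descendants("volume")
            .Where(v => v.Descendants("type").Any(t => t.Value == "A") &&
                        HasLocation(v.Descendants("location"), "old", "vault") &&
                        HasLocation(v.Descendants("location"), "new", "out"))
            .Select(v => (string)v.Element("id"))
            .ToArray();

    // All conditions must hold on one and the same <history>, as in the answer's query.
    public static string[] SameHistoryChecks(XElement root) =>
        root.Descendants("volume")
            .Where(v => v.Elements("history").Any(h =>
                (string)h.Element("type") == "A" &&
                HasLocation(h.Elements("location"), "old", "vault") &&
                HasLocation(h.Elements("location"), "new", "out")))
            .Select(v => (string)v.Element("id"))
            .ToArray();

    public static void Main()
    {
        XElement root = XElement.Parse(Sample);
        Console.WriteLine(IndependentChecks(root).Length);  // prints 1 (false positive)
        Console.WriteLine(SameHistoryChecks(root).Length);  // prints 0
    }
}
```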

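Update: abatishchev suggested a solution that uses an XPath query instead of LINQ to XML, but his query is too simple and does not return exactly what you asked for. The XPath query below does the trick, but it is also a bit long:

[XPath query missing from the source]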
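As a stand-in, and not the answerer's original query, here is a sketch of an XPath expression that states the same condition as the LINQ query above: a single <history> must be of type A and hold both the old vault location and the new out location. It is run through `XPathSelectElements` from `System.Xml.XPath`. The class name `XPathSketch`, the `//volume` starting point, and the sample document are assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathSketch
{
    // Assumed equivalent of the LINQ condition; all three tests apply to the same <history>.
    public const string Query =
        "//volume[history[type='A'" +
        " and location[@type='old']/repository='vault'" +
        " and location[@type='new']/repository='out']]/id";

    // Hypothetical sample: 000049 matches; 000050 has the move and the type "A" in different histories.
    public const string Sample = @"<data>
  <volume>
    <id>000049</id>
    <history>
      <type>A</type>
      <location type=""old""><repository>vault</repository></location>
      <location type=""new""><repository>out</repository></location>
    </history>
  </volume>
  <volume>
    <id>000050</id>
    <history>
      <type>S</type>
      <location type=""old""><repository>vault</repository></location>
      <location type=""new""><repository>out</repository></location>
    </history>
    <history>
      <type>A</type>
      <location type=""old""><repository>out</repository></location>
      <location type=""new""><repository>site</repository></location>
    </history>
  </volume>
</data>";

    // Returns the text of every <id> selected by the query.
    public static string[] Find(XDocument doc) =>
        doc.XPathSelectElements(Query).Select(e => e.Value).ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Find(XDocument.Parse(Sample))));  // prints 000049
    }
}
```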

Answer 1 (score: 1)

Why use such a complex and expensive LINQ to XML query when you can use a simple XPath query:

using System;
using System.Xml;

string xml = @"...";
string xpath = "data/customer/mediatype/volume/history/type[text()='A']/../location[@type='old' or @type='new']/../../id";

var doc = new XmlDocument();
doc.LoadXml(xml); // or use Load(path);

var nodes = doc.SelectNodes(xpath);

foreach (XmlNode node in nodes)
{
    Console.WriteLine(node.InnerText); // 000049
}

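Or, if you do not need the XML DOM model:

using System;
using System.IO; // StringReader
using System.Xml.XPath;

// xml and xpath are the strings defined in the previous snippet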

XPathDocument doc = null;
using (var stream = new StringReader(xml))
{
    doc = new XPathDocument(stream); // specify just path to file if you have such one
}
var nav = doc.CreateNavigator();
XPathNodeIterator nodes = (XPathNodeIterator)nav.Evaluate(xpath);
foreach (XPathNavigator node in nodes)
{
    Console.WriteLine(node.Value);
}