How do I remove duplicate XML nodes that have attributes using LINQ?

Time: 2014-06-24 12:12:31

Tags: xml

I have the following XML structure:

<movie>
    <profile>
    </profile>
    <address>
    </address>
    <Details detail1="1" detail2="1">
        <moviestart>09:20:00</moviestart>
        <movietime date="2015-01-20" hour="07:05:00" />
        <code>BA</code>
        <moviearrive code="MAH" place="MAHARASHTRA" />
        <moviedepart code="JAM" place="JAMMU" />
        <TYPE>STD</TYPE>
    </Details>
    <Details detail1="2" detail2="2">
        <moviestart>08:00:00</moviestart>
        <movietime date="2015-01-25" hour="07:35:00" />
        <code>BI</code>
        <moviearrive code="BIH" place="Bihar" />
        <moviedepart code="MYS" place="Mysore" />
        <TYPE>STD</TYPE>
    </Details>
    <Details detail1="1" detail2="1">
        <moviestart>09:20:00</moviestart>
        <movietime date="2015-01-20" hour="07:05:00" />
        <code>BA</code>
        <moviearrive code="MAH" place="MAHARASHTRA" />
        <moviedepart code="JAM" place="JAMMU" />
        <TYPE>STD</TYPE>
    </Details>
    <Details detail1="2" detail2="2">
        <moviestart>08:00:00</moviestart>
        <movietime date="2015-01-25" hour="07:35:00" />
        <code>BI</code>
        <moviearrive code="BIH" place="Bihar" />
        <moviedepart code="MYS" place="Mysore" />
        <TYPE>STD</TYPE>
    </Details>
</movie>

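After comparing every element value and attribute value, I want to eliminate the last two Details sections. I tried the following code, but with no luck.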

var elements = (from el in doc.Descendants("movie").Descendants("Details")
                select el).GroupBy(x => x.Value).Select(x => x.First());

The code above compares element values only and ignores attribute values. How can I remove these duplicates after the comparison?

After the duplicates are removed, the XML should look like this:

<movie>
    <profile>
    </profile>
    <address>
    </address>
    <Details detail1="1" detail2="1">
        <moviestart>09:20:00</moviestart>
        <movietime date="2015-01-20" hour="07:05:00" />
        <code>BA</code>
        <moviearrive code="MAH" place="MAHARASHTRA" />
        <moviedepart code="JAM" place="JAMMU" />
        <TYPE>STD</TYPE>
    </Details>
    <Details detail1="2" detail2="2">
        <moviestart>08:00:00</moviestart>
        <movietime date="2015-01-25" hour="07:35:00" />
        <code>BI</code>
        <moviearrive code="BIH" place="Bihar" />
        <moviedepart code="MYS" place="Mysore" />
        <TYPE>STD</TYPE>
    </Details>
</movie>

To clarify further, suppose the XML is modified slightly as follows:

<movie>
    <profile>
    </profile>
    <address>
    </address>
    <Details detail1="1" detail2="1">
        <moviestart>09:20:00</moviestart>
        <movietime date="2015-01-20" hour="07:05:00" />
        <code>BA</code>
        <moviearrive code="MAH" place="MAHARASHTRA" />
        <moviedepart code="JAM" place="JAMMU" />
        <TYPE>STD</TYPE>
    </Details>
    <Details detail1="2" detail2="1">
        <moviestart>08:00:00</moviestart>
        <movietime date="2015-01-25" hour="07:35:00" />
        <code>BI</code>
        <moviearrive code="BIH" place="Bihar" />
        <moviedepart code="MYS" place="Mysore" />
        <TYPE>STD</TYPE>
    </Details>
    <Details detail1="1" detail2="2">
        <moviestart>09:20:00</moviestart>
        <movietime date="2015-01-20" hour="07:05:00" />
        <code>BA</code>
        <moviearrive code="MAH" place="MAHARASHTRA" />
        <moviedepart code="JAM" place="JAMMU" />
        <TYPE>STD</TYPE>
    </Details>
    <Details detail1="2" detail2="2">
        <moviestart>08:00:00</moviestart>
        <movietime date="2015-01-25" hour="07:35:00" />
        <code>BI</code>
        <moviearrive code="BIH" place="Bihar" />
        <moviedepart code="MYS" place="Mysore" />
        <TYPE>STD</TYPE>
    </Details>
</movie>

Here only the attributes of the Details elements have different values. Any suggestions?

1 answer:

Answer 0 (score: 1)

You can use XNode.DeepEquals() to check whether two nodes have the same markup, for example:

var details = doc.Descendants("Details").ToList();
foreach (XElement detail in details)
{
    // look for a different Details node whose markup is identical to this one
    var duplicate = doc.Descendants("Details")
                       .FirstOrDefault(o => o != detail && XNode.DeepEquals(o, detail));
    // if one exists, this node is a duplicate: remove it (the last copy of each group is kept)
    if (duplicate != null) detail.Remove();
}
Console.WriteLine(doc.ToString());

Output:

<movie>
  <profile></profile>
  <address></address>
  <Details detail1="1" detail2="1">
    <moviestart>09:20:00</moviestart>
    <movietime date="2015-01-20" hour="07:05:00" />
    <code>BA</code>
    <moviearrive code="MAH" place="MAHARASHTRA" />
    <moviedepart code="JAM" place="JAMMU" />
    <TYPE>STD</TYPE>
  </Details>
  <Details detail1="2" detail2="2">
    <moviestart>08:00:00</moviestart>
    <movietime date="2015-01-25" hour="07:35:00" />
    <code>BI</code>
    <moviearrive code="BIH" place="Bihar" />
    <moviedepart code="MYS" place="Mysore" />
    <TYPE>STD</TYPE>
  </Details>
</movie>
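
A minimal sketch of why the two comparisons behave differently, assuming a cut-down Details element (the `CompareDemo` class and its `Make` helper are made up for illustration). `XElement.Value` concatenates only the descendant text, so attributes never enter the comparison, whereas `XNode.DeepEquals` also compares attribute names and values. By the same rule, the modified XML in the question, where every Details element has a different `detail1`/`detail2` combination, contains no duplicates as far as `DeepEquals` is concerned.

```csharp
using System;
using System.Xml.Linq;

public static class CompareDemo
{
    // Hypothetical helper: builds a shortened Details element whose
    // detail2 attribute is the only thing that varies.
    public static XElement Make(string detail2)
    {
        return new XElement("Details",
            new XAttribute("detail1", "1"),
            new XAttribute("detail2", detail2),
            new XElement("code", "BA"),
            new XElement("TYPE", "STD"));
    }

    public static void Main()
    {
        XElement a = Make("1");
        XElement b = Make("2"); // same children, different attribute value
        XElement c = Make("1"); // separate instance, identical markup

        // Value is the concatenated text content ("BASTD"); attributes are ignored.
        Console.WriteLine(a.Value == b.Value);     // True
        // DeepEquals also compares the attributes.
        Console.WriteLine(XNode.DeepEquals(a, b)); // False
        Console.WriteLine(XNode.DeepEquals(a, c)); // True
    }
}
```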

Another approach, based on the LINQ query you tried, prints the same XML as the first approach:

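// Group Details elements by their full markup (attributes included),
// keep the first element of each group and collect the rest as duplicates.
var duplicates = doc.Descendants("movie").Descendants("Details")
                    .GroupBy(x => x.ToString())
                    .SelectMany(g => g.Skip(1))
                    .ToList(); // snapshot before modifying the tree
foreach (XElement duplicate in duplicates)
{
    duplicate.Remove();
}
Console.WriteLine(doc.ToString());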
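
If the first occurrence of each group should be the one that survives, as the question asks, one option is a single pass with a `HashSet<XNode>` that uses `XNode.EqualityComparer`, the comparer counterpart of `XNode.DeepEquals`. This is only a sketch and not part of the original answer: `DetailsDeduplicator` and `RemoveDuplicateDetails` are hypothetical names, and the sample XML is shortened.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DetailsDeduplicator
{
    // Hypothetical helper: removes every Details element that is deep-equal
    // to an earlier one and returns how many elements were removed.
    public static int RemoveDuplicateDetails(XDocument doc)
    {
        var seen = new HashSet<XNode>(XNode.EqualityComparer);

        // HashSet.Add returns false when a deep-equal node is already present.
        // ToList() snapshots the duplicates before the tree is modified.
        List<XElement> duplicates = doc.Descendants("Details")
                                       .Where(d => !seen.Add(d))
                                       .ToList();

        foreach (XElement duplicate in duplicates)
        {
            duplicate.Remove();
        }
        return duplicates.Count;
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<movie>" +
            "<Details detail1='1' detail2='1'><code>BA</code></Details>" +
            "<Details detail1='2' detail2='2'><code>BI</code></Details>" +
            "<Details detail1='1' detail2='1'><code>BA</code></Details>" +
            "<Details detail1='2' detail2='2'><code>BI</code></Details>" +
            "<Details detail1='1' detail2='2'><code>BA</code></Details>" +
            "</movie>");

        int removed = RemoveDuplicateDetails(doc);

        Console.WriteLine(removed);                            // 2
        Console.WriteLine(doc.Descendants("Details").Count()); // 3
        Console.WriteLine(doc);
    }
}
```

Compared with the nested `Descendants` scan in the first approach, this visits each Details element once. The last sample element differs only in `detail2`, so it is kept.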