Removing duplicates from XML with LINQ and C#

Time: 2013-10-21 17:05:26

Tags: .net xml linq duplicates

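First, thank you all for your patience with my question. I searched here and found "Why doesn't this code find any duplicates within an xml element?" and "remove a duplicate element(with specific value) from xml using linq". They got me close, but I still have not solved it.

I need to remove duplicate elements from an XML file. The duplicates may or may not be present.

An XML fragment follows. The duplicate BuildNumber elements need to be removed.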

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<ProductSessions FileID="{C7DCB747-AB3A-4222-B14B-F7A7994C212F}">
    <Session LicenceNumber="E2240A66AC64CB770000" SessionGuid="{20c5d49e-7442-4fd0-b612-23aa743f4bd9}" FK_FileId="{C7DCB747-AB3A-4222-B14B-F7A7994C212F}">
      <TimeOpened>2013/10/14 11:18:43</TimeOpened>
      <LicenseInfo Configuration="XYZ" Description="Company Standard Config+More" DongleID="-error-no-dongle-" LicenseKey="FLEXlm Server Licence" Licensed="Company  USA" FK_SessionGuid="{20c5d49e-7442-4fd0-b612-23aa743f4bd9}" />
      <ProductVersion>Product 9.0.0 NTx86-64 (build 987)</ProductVersion>
      <BuildNumber>987</BuildNumber>
      <ProductArchitecture>NTx86-64</ProductArchitecture>
      <ProductVersion>9.0.0</ProductVersion>
      <SystemInfo OperativeSystem="Microsoft Windows 8 Enterprise Edition (build 9200) 64-bit" User=" " FK_SessionGuid="{20c5d49e-7442-4fd0-b612-23aa743f4bd9}" />
      <ApplicationName>X</ApplicationName>
      <TimeClosed>2013/10/14 11:42:57</TimeClosed>
    </Session>
    <Session LicenceNumber="E2240A66AC64CB770000" SessionGuid="{5682f705-baa1-46c0-a5ca-3c6d816c94cc}" FK_FileId="{C7DCB747-AB3A-4222-B14B-F7A7994C212F}">
      <TimeOpened>2013/10/14 11:55:23</TimeOpened>
      <LicenseInfo Configuration="XYZ" Description="Company Standard Config+More" DongleID="-error-no-dongle-" LicenseKey="FLEXlm Server Licence" Licensed="Company  USA" FK_SessionGuid="{5682f705-baa1-46c0-a5ca-3c6d816c94cc}" />
      <ProductVersion>Product 8.2.x NTx86-64 (build 123)</ProductVersion>
      <BuildNumber>123</BuildNumber>
      <BuildNumber>123</BuildNumber>
      <BuildNumber>123</BuildNumber>
      <ProductArchitecture>NTx86-64</ProductArchitecture>
      <ProductVersion>8.2.x</ProductVersion>
      <SystemInfo OperativeSystem="Microsoft Enterprise Edition (build 9200) 64-bit" User=" " FK_SessionGuid="{5682f705-baa1-46c0-a5ca-3c6d816c94cc}" />
      <ApplicationName>X</ApplicationName>
      <TimeClosed>2013/10/14 11:58:20</TimeClosed>
    </Session>

</ProductSessions>

My code is as follows:

// This gets the correct # of sessions
IEnumerable<XElement> childElements =
    from element in XmlFile.Elements().Descendants("Session")
    select element;
foreach (XElement el in childElements)
{
    var dups = XmlFile.Descendants(el.n).GroupBy(e => e.Descendants("BuildNumber").First().ToString());
    //remove the duplicates
    foreach (XElement ele in dups.SelectMany(g => g.Skip(1)))
        ele.Remove();
}

Can someone point me in the right direction?

2 answers:

Answer 0: (score: 2)

var xDoc = XDocument.Load("Input.xml");

var duplicates = xDoc.Root
                     .Elements("Session")
                     .SelectMany(s => s.Elements("BuildNumber")
                                       .GroupBy(b => (int)b)
                                       .SelectMany(g => g.Skip(1)))
                     .ToList();

foreach (var item in duplicates)
    item.Remove();

Or use the Remove() extension method on IEnumerable<XNode>:

xDoc.Root.Elements("Session")
         .SelectMany(s => s.Elements("BuildNumber")
                           .GroupBy(b => (int)b)
                           .SelectMany(g => g.Skip(1))).Remove();
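
For anyone who wants to try the grouping approach in isolation, here is a minimal sketch that wraps it in a helper. The class name BuildNumberDeduper and the method name RemoveRepeatedValues are made up for illustration and are not part of the answer. Like the answer, it assumes every BuildNumber holds an integer, since the (int) conversion throws a FormatException otherwise.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, named for illustration only.
public static class BuildNumberDeduper
{
    // Within each Session, removes every BuildNumber whose integer value
    // already appeared earlier in that Session. Returns the number removed.
    public static int RemoveRepeatedValues(XDocument doc)
    {
        var duplicates = doc.Root
            .Elements("Session")
            .SelectMany(s => s.Elements("BuildNumber")
                              .GroupBy(b => (int)b)          // group by numeric value
                              .SelectMany(g => g.Skip(1)))   // keep the first of each group
            .ToList();                                       // snapshot before mutating the tree

        foreach (var item in duplicates)
            item.Remove();

        return duplicates.Count;
    }
}
```

With the XML from the question loaded via XDocument.Load or XDocument.Parse, calling BuildNumberDeduper.RemoveRepeatedValues(xDoc) should leave one BuildNumber in each Session, because the first Session has only one and the second has three with the same value.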

Answer 1: (score: 0)

XmlFile.Descendants("Session")
       .SelectMany(s => s.Elements("BuildNumber").Skip(1))
       .Remove();

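This query selects every BuildNumber element except the first one in each session and removes them. As a result, only the first BuildNumber element remains in each Session element.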
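
The two answers give the same result for the XML in the question, but they may differ when a Session holds BuildNumber elements with different values. The sketch below compares them side by side. The class name DedupComparison and both method names are assumptions made for this illustration, and the grouped variant keys on the element's string value rather than the (int) conversion used in Answer 0.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical comparison helper, named for illustration only.
public static class DedupComparison
{
    // Skip(1) approach: keep only the first BuildNumber in each Session,
    // whatever its value. Returns the values that remain.
    public static List<string> KeepFirstOnly(XDocument doc)
    {
        doc.Descendants("Session")
           .SelectMany(s => s.Elements("BuildNumber").Skip(1))
           .Remove();   // Extensions.Remove snapshots the sequence before removing
        return doc.Descendants("BuildNumber").Select(b => b.Value).ToList();
    }

    // Grouping approach: keep the first BuildNumber of each distinct value
    // in each Session. Returns the values that remain.
    public static List<string> KeepFirstPerValue(XDocument doc)
    {
        doc.Descendants("Session")
           .SelectMany(s => s.Elements("BuildNumber")
                             .GroupBy(b => b.Value)
                             .SelectMany(g => g.Skip(1)))
           .Remove();
        return doc.Descendants("BuildNumber").Select(b => b.Value).ToList();
    }
}
```

For a Session containing the values 123, 123 and 456, KeepFirstOnly leaves just 123, while KeepFirstPerValue leaves 123 and 456. Which one is right depends on whether a Session is ever expected to carry more than one distinct build number.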