How to remove duplicates from XML in C#

Time: 2016-01-13 15:12:20

Tags: c# xml

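Can someone please help me? I have looked at other posts on removing duplicates from XML with C# (for example "Efficiently removing duplicate xml elements in c#") and tried adapting them to my problem, without success. I am not very experienced with XML; all I want to do is remove the duplicate AssessmentId elements from the following XML.

I have inherited this code and cannot change the structure.

Many thanks to anyone who can help.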

<Request>
    <Type>Delete</Type>
    <Client>
        <ClientId></ClientId>
        <Assignment>
            <AssignmentId></AssignmentId>
            <Assessments>
                <AssessmentId>664449ba-21b9-e511-999d-d8fc934939fe</AssessmentId>
                <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>   
                <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>865a13f8-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>865a13f8-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>06439800-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>06439800-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>f683aa08-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>f683aa08-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>063f8012-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>063f8012-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>16f7c329-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>16f7c329-e2b9-e511-9af1-d8fc934939fe</AssessmentId>       
                <AssessmentId>76706838-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>76706838-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>86194741-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>86194741-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>66cf984f-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>66cf984f-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
            </Assessments>
        </Assignment>
    </Client>
</Request>

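3 answers:

Answer 0: (score: 0)

I prefer working with C# objects, so you can deserialize this XML into objects with XmlSerializer. Visual Studio can also generate the C# classes from the XML: Edit -> Paste Special -> Paste XML As Classes.

Your code would look like this: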

        // Requires: using System.IO; using System.Linq; using System.Xml; using System.Xml.Serialization;
        Request request;
        var fileName = "File1.xml";

        // Parse the file into the generated Request class
        var sr = new XmlSerializer(typeof(Request));
        using (var fs = new FileStream(fileName, FileMode.Open))
        {
            request = (Request)sr.Deserialize(fs);
        }

        // Keep only the distinct assessment IDs
        var distinctAssessments = request.Client.Assignment.Assessments.Distinct();
        request.Client.Assignment.Assessments = distinctAssessments.ToArray();

        // Serialize to memory first, then overwrite the original file
        var xmlDocument = new XmlDocument();
        using (var stream = new MemoryStream())
        {
            sr.Serialize(stream, request);
            stream.Position = 0;
            xmlDocument.Load(stream);
            xmlDocument.Save(fileName);
        }
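
The snippet above relies on a Request class that the answer does not show. As a rough sketch, the classes might look like the following. The class and property names here are assumptions chosen to match the element names in the question; what "Paste XML As Classes" actually generates will differ in detail. The sample XML is shortened, and the program reads from a string instead of File1.xml so that it is self-contained.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical hand-written classes mirroring the XML in the question.
[XmlRoot("Request")]
public class Request
{
    public string Type { get; set; }
    public Client Client { get; set; }
}

public class Client
{
    public string ClientId { get; set; }
    public Assignment Assignment { get; set; }
}

public class Assignment
{
    public string AssignmentId { get; set; }

    // <Assessments> is the wrapper element, each item is an <AssessmentId>.
    [XmlArray("Assessments")]
    [XmlArrayItem("AssessmentId")]
    public string[] Assessments { get; set; }
}

public static class Program
{
    public static void Main()
    {
        const string xml = @"<Request>
  <Type>Delete</Type>
  <Client>
    <ClientId></ClientId>
    <Assignment>
      <AssignmentId></AssignmentId>
      <Assessments>
        <AssessmentId>664449ba-21b9-e511-999d-d8fc934939fe</AssessmentId>
        <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
        <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
      </Assessments>
    </Assignment>
  </Client>
</Request>";

        var serializer = new XmlSerializer(typeof(Request));

        Request request;
        using (var reader = new StringReader(xml))
        {
            request = (Request)serializer.Deserialize(reader);
        }

        // Plain string comparison is enough because the IDs are stored as strings.
        request.Client.Assignment.Assessments =
            request.Client.Assignment.Assessments.Distinct().ToArray();

        // Write the de-duplicated document to the console.
        serializer.Serialize(Console.Out, request);
    }
}
```

Because Assessments is modelled as a string array, Distinct() needs no custom comparer here.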

You could also use XSLT, but that looks a bit more complicated: https://msdn.microsoft.com/en-us/library/bb399419(v=vs.110).aspx
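
The answer does not show what the XSLT would look like. One possible sketch, driven from C# through XslCompiledTransform, is below. The stylesheet is an assumption, not taken from the linked page: it is an identity transform plus an empty template that drops any AssessmentId whose value already appeared in a preceding sibling. The sample XML is shortened.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class Program
{
    // Assumed stylesheet: copy everything, but emit nothing for repeated AssessmentId values.
    private const string Stylesheet = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:template match=""@*|node()"">
    <xsl:copy>
      <xsl:apply-templates select=""@*|node()""/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match=""AssessmentId[. = preceding-sibling::AssessmentId]""/>
</xsl:stylesheet>";

    public static void Main()
    {
        const string xml = @"<Request>
  <Client>
    <Assignment>
      <Assessments>
        <AssessmentId>664449ba-21b9-e511-999d-d8fc934939fe</AssessmentId>
        <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
        <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
      </Assessments>
    </Assignment>
  </Client>
</Request>";

        // Compile the stylesheet.
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsltReader);
        }

        // Run the transform and collect the result in memory.
        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, writer);
        }

        Console.WriteLine(output.ToString());
    }
}
```

The whitespace around the removed elements is copied through unchanged, so the output may contain blank lines where the duplicates used to be.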

Answer 1: (score: 0)

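If you are able to change the application that builds the XML (it sounds like you cannot), my preferred approach would be to use a HashSet<string> to build the Assessments collection. If the data comes from a SQL query, use DISTINCT or GROUP BY.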
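
As a minimal sketch of the HashSet<string> idea, assuming the IDs arrive as a plain sequence of strings (the incomingIds array is hypothetical, standing in for whatever the producing application reads): HashSet<string>.Add returns false for a value that is already present, which lets the code keep the first occurrence of each ID in its original order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class Program
{
    public static void Main()
    {
        // Hypothetical input containing repeats.
        var incomingIds = new[]
        {
            "664449ba-21b9-e511-999d-d8fc934939fe",
            "5ea8edd4-e1b9-e511-9af1-d8fc934939fe",
            "5ea8edd4-e1b9-e511-9af1-d8fc934939fe",
            "865a13f8-e1b9-e511-9af1-d8fc934939fe",
            "865a13f8-e1b9-e511-9af1-d8fc934939fe"
        };

        // Case-insensitive comparison is an assumption that suits GUID-like strings.
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var uniqueIds = new List<string>();

        foreach (var id in incomingIds)
        {
            // Add returns true only the first time a value is seen.
            if (seen.Add(id))
            {
                uniqueIds.Add(id);
            }
        }

        // Build the <Assessments> element from the unique IDs only.
        var assessments = new XElement("Assessments",
            uniqueIds.Select(id => new XElement("AssessmentId", id)));

        Console.WriteLine(assessments);
    }
}
```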

If you are working with the XML itself and really cannot change how it is produced, LINQ to XML with a custom IEqualityComparer should do the job:

string xml = @"<Request>
    <Type>Delete</Type>
    <Client>
        <ClientId></ClientId>
        <Assignment>
            <AssignmentId></AssignmentId>
            <Assessments>
                <AssessmentId>664449ba-21b9-e511-999d-d8fc934939fe</AssessmentId>
                <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>   
                <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>865a13f8-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>865a13f8-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>06439800-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>06439800-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>f683aa08-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>f683aa08-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>063f8012-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>063f8012-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>16f7c329-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>16f7c329-e2b9-e511-9af1-d8fc934939fe</AssessmentId>       
                <AssessmentId>76706838-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>76706838-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>86194741-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>86194741-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>66cf984f-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
                <AssessmentId>66cf984f-e2b9-e511-9af1-d8fc934939fe</AssessmentId>
            </Assessments>
        </Assignment>
    </Client>
</Request>";

XDocument xd = XDocument.Parse(xml);
var assessments = xd.Root.Element("Client")
                         .Element("Assignment")
                         .Element("Assessments");
// get the distinct ones
var distinctEls = assessments.Elements()
                             .Distinct(new XElComparer())
                             .ToList(); // ensure we actually get the list, not just the enumerator or elements we're about to remove

// remove all children
assessments.Elements().Remove();

// add back our distinct list
assessments.Add(distinctEls);

Console.WriteLine(xd);
Console.ReadKey();

And the XElComparer:

public class XElComparer : IEqualityComparer<XElement>
{
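    public bool Equals(XElement x, XElement y)
    {
        // Two nulls (or the same instance) are equal; a single null is not
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;

        return x.Value.Equals(y.Value);
    }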

    public int GetHashCode(XElement obj)
    {
        if (obj == null) return 0;

        return obj.Value.GetHashCode();
    }
}
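
If a separate comparer class feels heavy, a variation on the same LINQ to XML idea is to group the elements by value and remove everything after the first element of each group. This is a sketch rather than part of the original answer; it uses a shortened copy of the question's XML and assumes duplicates should be judged by element value only.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class Program
{
    public static void Main()
    {
        var xd = XDocument.Parse(@"<Request>
  <Client>
    <Assignment>
      <Assessments>
        <AssessmentId>664449ba-21b9-e511-999d-d8fc934939fe</AssessmentId>
        <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
        <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
        <AssessmentId>865a13f8-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
        <AssessmentId>865a13f8-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
      </Assessments>
    </Assignment>
  </Client>
</Request>");

        // For each <Assessments> parent, collect every element after the first of each value.
        // ToList() materializes the query before the tree is modified.
        var duplicates = xd.Descendants("Assessments")
            .SelectMany(parent => parent.Elements("AssessmentId")
                                        .GroupBy(e => e.Value)
                                        .SelectMany(g => g.Skip(1)))
            .ToList();

        foreach (var duplicate in duplicates)
        {
            duplicate.Remove();
        }

        Console.WriteLine(xd);
    }
}
```

This removes the duplicates in place, so the surviving elements keep their original positions instead of being removed and re-added.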

Answer 2: (score: 0)

You can do this with a simple (or not so simple) XPath query.

XmlDocument doc = new XmlDocument();
doc.LoadXml(xml); // xml in string form
var nodes = doc.SelectNodes("//AssessmentId[not(. = preceding-sibling::AssessmentId)]");

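This gives you a list of the unique AssessmentId nodes, which you can then use to remove all the existing nodes and add the unique ones back. You can also drop the "not" from the XPath query, in which case you get the list of duplicates instead, and you can remove those nodes from their parent directly.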
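
The second variant (selecting the duplicates and removing them from their parent) could be sketched as follows. This is an illustration on a shortened copy of the question's XML, not code from the answer; the result of SelectNodes is copied into a list first so that the document is not modified while the node list is being enumerated.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class Program
{
    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(@"<Request>
  <Client>
    <Assignment>
      <Assessments>
        <AssessmentId>664449ba-21b9-e511-999d-d8fc934939fe</AssessmentId>
        <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
        <AssessmentId>5ea8edd4-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
        <AssessmentId>865a13f8-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
        <AssessmentId>865a13f8-e1b9-e511-9af1-d8fc934939fe</AssessmentId>
      </Assessments>
    </Assignment>
  </Client>
</Request>");

        // Same query as above without "not": nodes whose value matches an earlier sibling.
        var duplicates = doc
            .SelectNodes("//AssessmentId[. = preceding-sibling::AssessmentId]")
            .Cast<XmlNode>()
            .ToList();

        foreach (var node in duplicates)
        {
            node.ParentNode.RemoveChild(node);
        }

        Console.WriteLine(doc.OuterXml);
    }
}
```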