Compare two XML files and save the differences to a result file

Date: 2013-08-19 16:09:13

Tags: c# xml

I have two XML files that I need to compare for differences. The XML is very simple:

File 1:

<?xml version="1.0" encoding="utf-8"?>
<Feeds zone="my zone">
  <Feed name="attribDump.json">ac1f07edc491a3d237cdfb1a17fc4551</Feed>
  <Feed name="focus_GroupsKV.txt">0f9e0a14a4ffce6ff5065b6e088c1f84</Feed>
  <Feed name="NAM_FORMATTED.csv">9e875496cdb072b5e54318d51295fdba</Feed>
  <Feed name="BNP\activityTitles.txt">2d27c0f19b71b4b411bcb00011d3f8b0</Feed>
</Feeds>

And file 2:

<?xml version="1.0" encoding="utf-8"?>
<FeedsRequest version="1">
<Feeds zone="my zone">
  <Feed name="attribDump.json">ac1f07edc491a3d237cdfb1a17fc4551</Feed>
  <Feed name="focus_GroupsKV.txt">0f9e0a14a4ffce6ff5065b6e088c1f84</Feed>
  <Feed name="BNP\activityTitles.txt">e54c5b851ee3ff3f43b10d24f2316431</Feed>
</Feeds>
</FeedsRequest>

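File 1 is the manifest of the files on our file share, and file 2 is used by a disconnected device that needs to be refreshed from file 1. The checks I need to perform are: 1) make sure every feed in file 1 is present in file 2, and 2) make sure any feed that is found has the same hash code (the long string). Once the checks are done, I need to create a response file that lists all of the feeds, each with a status attribute of ok (file found and matches), missing (file not found), or updated (file found, but it is an old version).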
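The three outcomes described above can be sketched as a single lookup. This is only an illustration: `FeedStatus`, `Classify`, and the `requestedHashes` dictionary (feed name to hash, as read from file 2) are hypothetical names that do not appear in the original code, and the hashes are compared as plain case-sensitive strings.

```csharp
using System.Collections.Generic;

public static class FeedStatus
{
    // requestedHashes: feed name -> hash as reported by the device (file 2).
    // name / hash: one feed entry from the stored list (file 1).
    public static string Classify(IDictionary<string, string> requestedHashes, string name, string hash)
    {
        string requestedHash;
        if (!requestedHashes.TryGetValue(name, out requestedHash))
        {
            // The device does not have this feed at all.
            return "missing";
        }

        // The device has the feed; it is current only if the hashes agree.
        return requestedHash == hash ? "ok" : "updated";
    }
}
```

Keeping the rule in one place means the XML handling only has to read names and hashes and write the returned status.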

So basically the result file would look like this:

<?xml version="1.0" encoding="utf-8"?>
<FeedsResponse version="1">
<Feeds zone="my zone">
  <Feed name="attribDump.json" status="ok">ac1f07edc491a3d237cdfb1a17fc4551</Feed>
  <Feed name="focus_GroupsKV.txt" status="ok">0f9e0a14a4ffce6ff5065b6e088c1f84</Feed>
  <Feed name="NAM_FORMATTED.csv" status="missing">afd2c620053ed4f85ab02b4cc5f7a2b2</Feed>
  <Feed name="BNP\activityTitles.txt" status="updated">90805b851ee3ff3f43b10d24f2316431</Feed>
</Feeds>
</FeedsResponse>

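What I'm currently doing is looping through all of the feeds in file 1 and checking each one against file 2 for differences. Where I've run into difficulty working with the XML is how to build the response document.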

        FileInfo feedList = new FileInfo(_feedList);
        FileInfo feedRequest = new FileInfo(_feedRequest);

        // Load the documents
        XmlDocument feedListXmlDoc = new XmlDocument();
        feedListXmlDoc.Load(_feedList);

        // Load the documents
        XmlDocument feedRequestXmlDoc = new XmlDocument();
        feedRequestXmlDoc.Load(_feedRequest);

        //create response doc
        XmlDocument feedResponseXmlDoc = new XmlDocument();

        // Define a single node
        XmlNode feedListNode;
        XmlNode feedRequestNode;

        // Get the root Xml element
        XmlElement feedListRoot = feedListXmlDoc.DocumentElement;
        XmlElement feedRequestRoot = feedRequestXmlDoc.DocumentElement;

        // Get a list of all Feed elements in each document
        XmlNodeList feedListXml = feedListRoot.GetElementsByTagName("Feed");
        XmlNodeList feedRequestXml = feedRequestRoot.GetElementsByTagName("Feed");

        // Create an XmlWriterSettings object with the correct options. 
        XmlWriter writer = null;
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.IndentChars = ("  ");
        settings.OmitXmlDeclaration = false;

        // Create the XmlWriter object and write some content.
        writer = XmlWriter.Create(_resultPath, settings);
        writer.WriteStartElement("FeedsDiff");

        // The compare algorithm
        bool feedMatch = false;

        int j = 0;

        try 
        {
            // loop through list of current feeds
            for (int i = 0; i < feedListXml.Count; i++)
            {
                feedListNode = feedListXml.Item(i);

                string feedListName = feedListNode.Attributes["name"].Value.ToString();
                string feedListHash = feedListXml.Item(i).InnerText.ToString();

                //check feed request list for a match
                while (j < feedRequestXml.Count && feedMatch == false)
                {
                    feedRequestNode = feedRequestXml.Item(j);
                    string feedRequestName = feedRequestNode.Attributes["name"].Value.ToString();

                    //checks to see if feed names match
                    if (feedListName == feedRequestName)
                    {
                        feedMatch = true;
                        string feedRequestHash = feedRequestXml.Item(j).InnerText.ToString();

                        //since we found the node, we can remove it from the request list
                        XmlNode node = feedRequestNode.ParentNode;
                        node.RemoveChild(feedRequestNode);

                        //checks to see if hash codes match
                        if (feedListHash == feedRequestHash)
                        {
                            //if name and code match, move to the next one
                            feedMatch = true;

                            //add 'status="ok"' attribute to the node
                            //feedResponseXmlDoc.ImportNode(feedRequestNode,false);

                            Debug.WriteLine(feedListName + " name and hash match");

                            j = 0;
                        }
                        else 
                        {

                            feedMatch = true;

                            //feed has been updated since last device sync
                            //need to add status='update' attribute and append file to response
                            Debug.WriteLine(feedListName + " name matched but hash did not");
                        }
                    }
                    else
                    {
                        //names didn't match
                        //add status="missing" to the node
                        j++;
                    }
                }
                feedMatch = false;
            }
            // end Xml document
            writer.WriteEndElement();
            writer.Flush();
        }
        finally
        {
            if (writer != null)
                writer.Close();
        }

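Right now I'm trying to instantiate the response document before the loop and then just add elements to it, but I'm having a hard time finding a concise way to do that. Any help is appreciated.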
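One possible way to set up the response document before the loop is sketched below. It assumes the stored list's root is the `<Feeds>` element, as in file 1; `ResponseBuilder`, `BuildResponse`, and `SetStatus` are hypothetical names used only for illustration. The key point is that an `XmlNode` belongs to the document that created it, so the `<Feeds>` subtree is copied with `ImportNode` before being appended.

```csharp
using System.Xml;

public static class ResponseBuilder
{
    // Returns a new document whose <FeedsResponse version="1"> root
    // contains a deep copy of the stored list's <Feeds> element.
    public static XmlDocument BuildResponse(XmlDocument feedListXmlDoc)
    {
        XmlDocument response = new XmlDocument();
        response.AppendChild(response.CreateXmlDeclaration("1.0", "utf-8", null));

        XmlElement root = response.CreateElement("FeedsResponse");
        root.SetAttribute("version", "1");
        response.AppendChild(root);

        // Deep-copy the <Feeds> subtree into the new document, then attach it.
        XmlNode feeds = response.ImportNode(feedListXmlDoc.DocumentElement, true);
        root.AppendChild(feeds);

        return response;
    }

    // Adds (or overwrites) the status attribute on one <Feed> element.
    public static void SetStatus(XmlElement feed, string status)
    {
        feed.SetAttribute("status", status);
    }
}
```

The comparison loop can then walk the `Feed` elements of the returned document, call `SetStatus` on each, and finish with `response.Save(path)`, leaving the stored list itself untouched.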

2 Answers:

Answer 0: (score: 0)

Take a look at CodeBlocks in my open-source project differ on CodePlex; it is designed for exactly this kind of situation. It is also available on NuGet as "differ

Answer 1: (score: 0)

I figured it out:

// requires: using System.Diagnostics; using System.Xml;
public void CompareXml(string _feedList, string _feedRequest, string _resultPath)
    {
        // Load the stored feed list (file 1) and the device's request (file 2)
        XmlDocument feedListXmlDoc = new XmlDocument();
        feedListXmlDoc.Load(_feedList);

        XmlDocument feedRequestXmlDoc = new XmlDocument();
        feedRequestXmlDoc.Load(_feedRequest);

        // Get a list of feeds for the stored list and the request
        XmlNodeList feedListXml = feedListXmlDoc.DocumentElement.GetElementsByTagName("Feed");
        XmlNodeList feedRequestXml = feedRequestXmlDoc.DocumentElement.GetElementsByTagName("Feed");

        // loop through list of current feeds
        for (int i = 0; i < feedListXml.Count; i++)
        {
            XmlNode feedListNode = feedListXml.Item(i);
            string feedListName = feedListNode.Attributes["name"].Value;
            string feedListHash = feedListNode.InnerText;

            // assume the feed is missing until it is found in the request
            string status = "missing";

            // search the whole request list from the start for every feed,
            // so the result does not depend on the order of the two files
            for (int j = 0; j < feedRequestXml.Count; j++)
            {
                XmlNode feedRequestNode = feedRequestXml.Item(j);

                //checks to see if feed names match
                if (feedRequestNode.Attributes["name"].Value == feedListName)
                {
                    //"ok" if the hashCodes match, otherwise "updated"
                    status = feedListHash == feedRequestNode.InnerText ? "ok" : "updated";
                    break;
                }
            }

            //create the status attribute and attach it to the stored feed
            XmlAttribute attr = feedListXmlDoc.CreateAttribute("status");
            attr.Value = status;
            feedListNode.Attributes.Append(attr);

            Debug.WriteLine(feedListName + " status: '" + status + "'");
        }

        // save first, then report, so the message is only logged on success
        feedListXmlDoc.Save(_resultPath);
        Debug.WriteLine("Result file has been written out at " + _resultPath);
    }
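For comparison, here is a minimal sketch of the same check written with LINQ to XML, which also wraps the output in the `<FeedsResponse version="1">` root shown in the question. It is not the code I actually used: `FeedComparer` and `Compare` are hypothetical names, the stored list's root is assumed to be `<Feeds>`, and feed names in the request are assumed to be unique (`ToDictionary` throws on duplicates).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FeedComparer
{
    public static XDocument Compare(XDocument feedList, XDocument feedRequest)
    {
        // Index the request's feeds by name: name -> hash.
        Dictionary<string, string> requested = feedRequest
            .Descendants("Feed")
            .ToDictionary(f => (string)f.Attribute("name"), f => f.Value);

        XElement sourceFeeds = feedList.Root; // assumed to be <Feeds zone="...">

        // Rebuild <Feeds>, copying its attributes and adding a status to each feed.
        XElement feeds = new XElement("Feeds",
            sourceFeeds.Attributes(),
            sourceFeeds.Elements("Feed").Select(f =>
            {
                string name = (string)f.Attribute("name");
                string requestedHash;
                string status = !requested.TryGetValue(name, out requestedHash) ? "missing"
                    : requestedHash == f.Value ? "ok"
                    : "updated";

                return new XElement("Feed",
                    new XAttribute("name", name),
                    new XAttribute("status", status),
                    f.Value);
            }));

        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("FeedsResponse", new XAttribute("version", "1"), feeds));
    }
}
```

A call would look like `FeedComparer.Compare(XDocument.Load(_feedList), XDocument.Load(_feedRequest)).Save(_resultPath);`. The dictionary lookup replaces the inner loop, and neither input document is modified.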