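I have a very large XML file to sort. I am trying to verify that there are no differences between two XML lists, but every diff application I have tried reports a lot of differences, even though I know that 98% of the information is in both lists.
I have tried a few ways of sorting the XML by one or more elements so that both files are ordered the same way, but without luck, because the rows have no unique value, so to speak. There is an email field, but sometimes the Email attribute is missing entirely, which makes it a poor field to sort on.
It looks like this: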
<Customer>
  <row CompanyID="1" Name="John" Email="John@mail.com" />
  <row CompanyID="1" Name="Jane" Email="Jane@mail.com" />
  <row CompanyID="1" Name="Howard" Email="Howard@mail.com" />
  <row CompanyID="2" Name="Jen" Email="Jen@mail.com" />
  <row CompanyID="2" Name="James" Email="James@mail.com" />
  <row CompanyID="3" Name="Phil" Email="Phil@mail.com" />
  <row CompanyID="3" Name="Kenny" />
  <row CompanyID="3" Name="Andrew" Email="Andrew@mail.com" />
  <row CompanyID="3" Name="Greg" Email="Greg@mail.com" />
  <row CompanyID="4" Name="Julia" Email="Julia@mail.com" />
  <row CompanyID="4" Name="Hannah" Email="Hannah@mail.com" />
  <row CompanyID="4" Name="Riley" Email="" />
  <row CompanyID="4" Name="Anders" Email="Anders@mail.com" />
</Customer>
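(XML for illustration only)
Is there a good way to solve this?
What I need is either a good way to sort the files, or a comparison application that can compare XML without taking the order of the elements into account.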
Answer 0 (score: 1)
Use Microsoft XML Diff: https://msdn.microsoft.com/en-us/library/aa302294.aspx
public void GenerateDiffGram(string originalFile, string finalFile, XmlWriter diffGramWriter)
{
    // IgnoreChildOrder makes the comparison independent of element order.
    XmlDiff xmldiff = new XmlDiff(XmlDiffOptions.IgnoreChildOrder |
                                  XmlDiffOptions.IgnoreNamespaces |
                                  XmlDiffOptions.IgnorePrefixes);
    // The third argument is false because the inputs are whole documents, not fragments.
    bool bIdentical = xmldiff.Compare(originalFile, finalFile, false, diffGramWriter);
    diffGramWriter.Close();
}
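The Compare() method returns true if the two files are identical and false otherwise. The last parameter, diffGramWriter, is where the comparison output is written. The generated output is an XML document (a diffgram) that records the differences between the two files. Here is how it looks in this scenario: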
public bool CompareXml(string file1, string file2, string diffFileNameWithPath)
{
    // file1 and file2 are file paths; Compare() opens the files itself.
    XmlDiff xmldiff = new XmlDiff(XmlDiffOptions.IgnoreChildOrder |
                                  XmlDiffOptions.IgnoreNamespaces |
                                  XmlDiffOptions.IgnorePrefixes);
    using (FileStream fs = new FileStream(diffFileNameWithPath, FileMode.Create))
    using (XmlWriter diffGramWriter = XmlWriter.Create(fs))
    {
        // The diffgram is written to diffFileNameWithPath; the result is true when the files match.
        return xmldiff.Compare(file1, file2, false, diffGramWriter);
    }
}
Answer 1 (score: 0)
I created a custom XML sorter that may help:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace SortXml
{
    class Program
    {
        const string INPUT_FILENAME = @"c:\temp\test.xml";
        const string OUTPUT_FILENAME = @"c:\temp\test1.xml";

        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(INPUT_FILENAME);
            XmlSort xmlSort = new XmlSort();
            xmlSort.SortXml(doc);
            doc.Save(OUTPUT_FILENAME);
        }
    }

    public class XmlSort : IComparer<XElement>
    {
        public void SortXml(XDocument doc)
        {
            RecursiveSort(doc.Root);
        }

        public void RecursiveSort(XElement parent)
        {
            // Snapshot the children first: the tree is modified below, and
            // changing it while enumerating Elements() cuts the loop short.
            List<XElement> children = parent.Elements().ToList();
            foreach (XElement child in children)
            {
                RecursiveSort(child);
            }
            if (children.Count > 1)
            {
                children.Sort(this);
                // Detach and re-append in sorted order, so the parent keeps
                // its own name, namespace, and attributes.
                foreach (XElement child in children)
                {
                    child.Remove();
                }
                parent.Add(children);
            }
        }

        public int Compare(XElement a, XElement b)
        {
            // Ordinal comparison keeps the order independent of culture settings.
            return string.CompareOrdinal(BuildKey(a), BuildKey(b));
        }

        // Sort key: element name, then attributes ordered by name, then text content.
        private static string BuildKey(XElement e)
        {
            string attributes = string.Join("^", e.Attributes()
                .OrderBy(x => x.Name.LocalName, StringComparer.Ordinal)
                .Select(x => x.Name.LocalName + "=" + x.Value));
            return string.Join("^", e.Name.LocalName, attributes, e.Value);
        }
    }
}
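If the goal is only to confirm that both files contain the same rows regardless of order, sorting may not be needed at all: each row can be reduced to a canonical string and the two files compared as multisets. The following is a minimal sketch of that idea, assuming a flat list of row elements like the sample in the question. The names RowSetDiff, RowKey, CountRows and OnlyInFirst are made up for illustration, and treating an empty attribute such as Email="" the same as a missing one is an assumption that may not suit every data set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class RowSetDiff
{
    // Canonical text for one row: element name plus attributes ordered by name.
    // Assumption: an attribute with an empty value counts as missing.
    public static string RowKey(XElement row)
    {
        IEnumerable<string> attrs = row.Attributes()
            .Where(a => a.Value.Length > 0)
            .OrderBy(a => a.Name.LocalName, StringComparer.Ordinal)
            .Select(a => a.Name.LocalName + "=" + a.Value);
        return row.Name.LocalName + "|" + string.Join("|", attrs);
    }

    // Counts how often each distinct row occurs, so duplicates are respected.
    public static Dictionary<string, int> CountRows(XElement root)
    {
        return root.Elements()
            .GroupBy(r => RowKey(r))
            .ToDictionary(g => g.Key, g => g.Count());
    }

    // Rows that occur more often in 'first' than in 'second'.
    public static List<string> OnlyInFirst(XElement first, XElement second)
    {
        Dictionary<string, int> other = CountRows(second);
        var result = new List<string>();
        foreach (KeyValuePair<string, int> pair in CountRows(first))
        {
            int otherCount;
            other.TryGetValue(pair.Key, out otherCount); // stays 0 when the key is absent
            for (int i = otherCount; i < pair.Value; i++)
            {
                result.Add(pair.Key);
            }
        }
        return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        XElement a = XElement.Parse(
            "<Customer><row CompanyID=\"1\" Name=\"John\" />" +
            "<row CompanyID=\"3\" Name=\"Kenny\" Email=\"\" /></Customer>");
        XElement b = XElement.Parse(
            "<Customer><row Name=\"Kenny\" CompanyID=\"3\" />" +
            "<row CompanyID=\"1\" Name=\"Jane\" /></Customer>");

        foreach (string row in RowSetDiff.OnlyInFirst(a, b))
            Console.WriteLine("only in first:  " + row);
        foreach (string row in RowSetDiff.OnlyInFirst(b, a))
            Console.WriteLine("only in second: " + row);
    }
}
```

With the two small documents in Demo, the Kenny rows match despite the different attribute order and the empty Email, so only the John row (first file) and the Jane row (second file) are reported. For real files, XDocument.Load(path).Root would take the place of XElement.Parse.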