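I'm trying to find duplicate elements among a set of XElement objects and write a generic function that removes the duplicates. Something like:
public List<XElement> RemoveDuplicatesFromXml(List<XElement> xele)
{
    // Pass the XElement list in as the argument and get the list back after deleting the duplicate entries.
    return xele;
}
The XML looks like this: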
<Execute ID="7300" Attrib1="xyz" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
<Execute ID="7301" Attrib1="xyz" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
<Execute ID="7302" Attrib1="xyz1" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
I want to detect duplicates based on every attribute except ID, and then delete the one with the smaller ID.
Thanks,
Answer 0 (score: 1)
You can implement a custom IEqualityComparer for this task:
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class XComparer : IEqualityComparer<XElement>
{
    // Local names of the attributes to ignore when comparing.
    private readonly IList<string> _exceptions;

    public XComparer(params string[] exceptions)
    {
        _exceptions = new List<string>(exceptions);
    }

    public bool Equals(XElement a, XElement b)
    {
        var attA = a.Attributes().ToList();
        var attB = b.Attributes().ToList();
        var setA = AttributeNames(attA);
        var setB = AttributeNames(attB);
        // Elements with different sets of compared attribute names are never equal.
        if (!setA.SetEquals(setB))
            return false;
        foreach (var name in setA)
        {
            var xa = attA.First(x => x.Name.LocalName == name);
            var xb = attB.First(x => x.Name.LocalName == name);
            // XAttribute.Value is never null, so a plain string comparison is enough.
            if (xa.Value != xb.Value)
                return false;
        }
        return true;
    }

    // Local names of all attributes that take part in the comparison.
    private HashSet<string> AttributeNames(IList<XAttribute> e)
    {
        return new HashSet<string>(e.Select(x => x.Name.LocalName).Except(_exceptions));
    }

    public int GetHashCode(XElement e)
    {
        // XOR the hashes of the compared values so that attribute order does not matter.
        var h = 0;
        var atts = e.Attributes().ToList();
        foreach (var name in AttributeNames(atts))
        {
            h ^= atts.First(x => x.Name.LocalName == name).Value.GetHashCode();
        }
        return h;
    }
}
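Usage:
var comp = new XComparer("ID");
var distXEle = xele.Distinct(comp);
Note that the IEqualityComparer implementation in this answer compares only LocalName and does not take namespaces into account. If an element has several attributes with the same local name, this implementation uses the first one.
You can see a demo here: https://dotnetfiddle.net/w2DteS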
If you want to "delete the one with the smaller ID", which means you want to keep the largest ID, you can chain a .Select call after .Distinct:
var comp = new XComparer("ID");
var distXEle = xele
    .Distinct(comp)
    .Select(z => xele
        .Where(a => comp.Equals(z, a))
        .OrderByDescending(a => int.Parse(a.Attribute("ID").Value))
        .First()
    );
This guarantees that, for each set of duplicates, you get the element with the largest ID.
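The chained .Select rescans the whole list once per distinct element. One possible alternative, sketched here, passes a comparer to GroupBy so that every element is grouped only once. IgnoringComparer and DedupeByComparer are hypothetical names introduced for this sketch. The comparer is a compact stand-in for XComparer rather than an exact copy: it compares every attribute that shares a local name instead of only the first one. It also assumes every element carries an integer ID attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for XComparer: two elements are equal when their
// attributes, minus the ignored local names, match by name and value.
sealed class IgnoringComparer : IEqualityComparer<XElement>
{
    private readonly HashSet<string> _ignored;

    public IgnoringComparer(params string[] ignored)
    {
        _ignored = new HashSet<string>(ignored);
    }

    // Compared attributes as (name, value) pairs, sorted by local name so
    // that attribute order in the document does not matter.
    private List<(string Name, string Value)> Pairs(XElement e)
    {
        return e.Attributes()
            .Where(a => !_ignored.Contains(a.Name.LocalName))
            .OrderBy(a => a.Name.LocalName, StringComparer.Ordinal)
            .Select(a => (a.Name.LocalName, a.Value))
            .ToList();
    }

    public bool Equals(XElement a, XElement b)
    {
        return Pairs(a).SequenceEqual(Pairs(b));
    }

    public int GetHashCode(XElement e)
    {
        // Combine the pair hashes; equal pair lists always give equal hashes.
        return Pairs(e).Aggregate(17, (h, p) => unchecked(h * 31 + p.GetHashCode()));
    }
}

static class DedupeByComparer
{
    // Groups equal elements once and keeps the highest ID from each group.
    public static List<XElement> KeepLargestId(
        IEnumerable<XElement> xele, IEqualityComparer<XElement> comp)
    {
        return xele
            .GroupBy(e => e, comp)
            // The explicit (int) conversion on XAttribute parses the value.
            .Select(g => g.OrderByDescending(e => (int)e.Attribute("ID")).First())
            .ToList();
    }
}
```

Calling DedupeByComparer.KeepLargestId(xele, new IgnoringComparer("ID")) on the three sample elements from the question should leave the elements with ID 7301 and 7302.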
Answer 1 (score: 1)
Use LINQ GroupBy:
var doc = XDocument.Parse(yourXmlString);
// Group the elements by every attribute except ID.
var groups = doc.Root
    .Elements()
    .GroupBy(element => new
    {
        Attrib1 = element.Attribute("Attrib1").Value,
        Attrib2 = element.Attribute("Attrib2").Value,
        Attrib3 = element.Attribute("Attrib3").Value,
        Attrib4 = element.Attribute("Attrib4").Value,
        Attrib5 = element.Attribute("Attrib5").Value
    });
var duplicates = groups.SelectMany(group =>
{
    if (group.Count() == 1) // remove this if you want only duplicates
    {
        return group;
    }
    // Drop the element with the smallest ID and keep the rest of the group.
    int minId = group.Min(element => int.Parse(element.Attribute("ID").Value));
    return group.Where(element => int.Parse(element.Attribute("ID").Value) > minId);
});
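The solution above removes the element with the smallest ID from every group of elements whose attributes are duplicated.
If you want to return only the elements that have duplicates, remove the if branch from the last lambda.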
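Note that with three or more duplicates the code above drops only the smallest ID and keeps the others. If the goal is exactly one survivor per group, a variant along these lines might fit the signature from the question. This is a minimal sketch: the class name XmlDeduplicator is hypothetical, the five attribute names are taken from the sample XML, and every element is assumed to have an integer ID attribute.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class XmlDeduplicator
{
    // Keeps, for each combination of Attrib1..Attrib5, only the element
    // with the largest ID. Groups come back in order of first appearance.
    public static List<XElement> RemoveDuplicatesFromXml(List<XElement> xele)
    {
        return xele
            .GroupBy(e => new
            {
                // The explicit (string) conversion yields null instead of
                // throwing when an attribute is missing.
                Attrib1 = (string)e.Attribute("Attrib1"),
                Attrib2 = (string)e.Attribute("Attrib2"),
                Attrib3 = (string)e.Attribute("Attrib3"),
                Attrib4 = (string)e.Attribute("Attrib4"),
                Attrib5 = (string)e.Attribute("Attrib5")
            })
            .Select(g => g.OrderByDescending(e => (int)e.Attribute("ID")).First())
            .ToList();
    }
}
```

Anonymous types compare by value, member by member, so two elements land in the same group only when all five attribute values match.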