I use the following code to remove duplicate rows from a DataTable based on the value of a single field (keyField):
IEnumerable<DataRow> uniqueContacts = dt.AsEnumerable()
    .GroupBy(x => x[keyField].ToString())
    .Select(g => g.First());
DataTable dtOut = uniqueContacts.CopyToDataTable();
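How can I extend this code so that the LINQ query removes duplicates based on the values of a list of fields? For example, removing all rows that share the same "FirstName" and "LastName"?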
Answer 0 (score: 2)
You can use an anonymous type as the grouping key:
IEnumerable<DataRow> uniqueContacts = dt.AsEnumerable()
    .GroupBy(row => new {
        FirstName = row.Field<string>("FirstName"),
        LastName = row.Field<string>("LastName")
    })
    .Select(g => g.First());
Since you need a solution that works with a List<string> of column names that is not known at compile time, you can use this class:
public class MultiFieldComparer : IEquatable<IEnumerable<object>>, IEqualityComparer<IEnumerable<object>>
{
    private IEnumerable<object> objects;

    public MultiFieldComparer(IEnumerable<object> objects)
    {
        this.objects = objects;
    }

    public bool Equals(IEnumerable<object> x, IEnumerable<object> y)
    {
        return x.SequenceEqual(y);
    }

    public int GetHashCode(IEnumerable<object> objects)
    {
        unchecked
        {
            int hash = 17;
            foreach (object obj in objects)
                hash = hash * 23 + (obj == null ? 0 : obj.GetHashCode());
            return hash;
        }
    }

    public override int GetHashCode()
    {
        return GetHashCode(this.objects);
    }

    public override bool Equals(object obj)
    {
        MultiFieldComparer other = obj as MultiFieldComparer;
        if (other == null) return false;
        return this.Equals(this.objects, other.objects);
    }

    public bool Equals(IEnumerable<object> other)
    {
        return this.Equals(this.objects, other);
    }
}
This extension method uses that class as the grouping key:
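// Must be declared inside a static class to be usable as an extension method.
public static IEnumerable<DataRow> RemoveDuplicates(this IEnumerable<DataRow> rows, IEnumerable<string> fields)
{
    return rows
        // ToArray() materializes each key once, so the lazy query is not re-run on every hash or equality check.
        .GroupBy(row => new MultiFieldComparer(fields.Select(f => row[f]).ToArray()))
        .Select(g => g.First());
}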
Then it's simple:
List<string> columns = new List<string> { "FirstName", "LastName" };
var uniqueContacts = dt.AsEnumerable().RemoveDuplicates(columns).CopyToDataTable();
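As a possible alternative to wrapping each key in its own object, the same idea can be sketched with an IEqualityComparer<DataRow> passed directly to GroupBy. The class name ColumnSubsetComparer and the sample rows are hypothetical, made up for illustration; this is a minimal sketch that assumes every listed column exists in the table, not a definitive implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical comparer: two rows are equal when they match on every listed column.
public sealed class ColumnSubsetComparer : IEqualityComparer<DataRow>
{
    private readonly string[] columns;

    public ColumnSubsetComparer(IEnumerable<string> columns)
    {
        // Materialize once so the column list is not re-enumerated per row.
        this.columns = columns.ToArray();
    }

    public bool Equals(DataRow x, DataRow y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        // object.Equals compares boxed values by value; DBNull.Value equals DBNull.Value.
        return columns.All(c => object.Equals(x[c], y[c]));
    }

    public int GetHashCode(DataRow row)
    {
        unchecked
        {
            int hash = 17;
            // A DataRow indexer returns DBNull.Value rather than null, so GetHashCode is safe here.
            foreach (string c in columns)
                hash = hash * 23 + row[c].GetHashCode();
            return hash;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample data (assumed for illustration): the first two rows share FirstName and LastName.
        var dt = new DataTable();
        dt.Columns.Add("FirstName", typeof(string));
        dt.Columns.Add("LastName", typeof(string));
        dt.Columns.Add("Email", typeof(string));
        dt.Rows.Add("Ann", "Lee", "ann@example.com");
        dt.Rows.Add("Ann", "Lee", "ann.lee@example.com");
        dt.Rows.Add("Bob", "Lee", "bob@example.com");

        var comparer = new ColumnSubsetComparer(new[] { "FirstName", "LastName" });

        // Group rows that compare equal and keep the first row of each group.
        DataTable dtOut = dt.AsEnumerable()
            .GroupBy(row => row, comparer)
            .Select(g => g.First())
            .CopyToDataTable();

        Console.WriteLine(dtOut.Rows.Count); // 2 with the sample rows above
    }
}
```

Both approaches keep the first row of each duplicate group. Note that CopyToDataTable throws an InvalidOperationException when the source sequence contains no rows, so an empty input table may need a guard.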