Performance of checking for existence before adding to a list vs. Distinct in LINQ

Time: 2018-07-25 04:31:32

Tags: c# performance linq

Inside a foreach loop, I want to add Products to a List, but the List must not contain duplicate Products. I currently have two ideas.

1/ In the loop, before adding a Product to the List, I check whether it already exists in the List; if it does not, I add it.

foreach (var product in products)
{
    // code logic
    if(!listProduct.Any(x => x.Id == product.Id))
    {
        listProduct.Add(product);
    }
}

2/ In the loop, I add every Product to the List, even if there are duplicates. Then, outside the loop, I use Distinct to remove the duplicate records.

foreach (var product in products)
{
    // code logic
    listProduct.Add(product);
}
listProduct = listProduct.Distinct().ToList();
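
One caveat for the second approach: Distinct() with no arguments uses the default equality comparer, so for a class that does not override Equals and GetHashCode it compares references, and two Product objects with the same Id would both survive. The following is only a minimal sketch of passing a comparer so that Distinct deduplicates by Id. The Product class shape, ProductIdComparer and DistinctById are hypothetical names introduced for illustration; the question only shows that a Product has an Id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Product type; the question only shows that it has an Id.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// Hypothetical comparer: two products are equal when their Ids match.
public sealed class ProductIdComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Product obj) => obj.Id.GetHashCode();
}

public static class DistinctDemo
{
    // Returns one product per distinct Id.
    public static List<Product> DistinctById(IEnumerable<Product> products)
    {
        return products.Distinct(new ProductIdComparer()).ToList();
    }

    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { Id = 1, Name = "A" },
            new Product { Id = 1, Name = "A (duplicate)" },
            new Product { Id = 2, Name = "B" },
        };

        // Without the comparer, all three objects count as distinct references.
        Console.WriteLine(products.Distinct().Count());    // 3
        Console.WriteLine(DistinctById(products).Count);   // 2
    }
}
```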

Which of these two approaches is more efficient? Or is there a better way to add records to the List while avoiding duplicates?

4 answers:

Answer 0 (score: 2)

First, select the elements that are not already in the collection:

var newProducts = products.Where(x => !listProduct.Any(y => x.Id == y.Id));

And then add them using AddRange:

listProduct.AddRange(newProducts);

Or you can use a foreach loop instead:

foreach (var product in newProducts)
{
    listProduct.Add(product);
}

An even simpler solution, with no need for Distinct, is Union:

var newProductList = products.Union(listProduct).ToList();

But Union does not perform well.

Answer 1 (score: 2)

I would take a third approach: a HashSet. It has a constructor overload that accepts an IEnumerable. This constructor removes duplicates:

"If the input collection contains duplicates, the set will contain one of each unique element. No exception is thrown."

Source: HashSet<T> Constructor

Usage:

List<Product> myProducts = ...;
var setOfProducts = new HashSet<Product>(myProducts);

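After duplicates are removed, setOfProducts[4] has no proper meaning.

Therefore a HashSet is not an IList<Product> but an ICollection<Product>: you can Count, Add, Remove and so on, everything you can do with a List. The only thing you cannot do is fetch an element by index.

Answer 2 (score: 1)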

From what you have included, you are storing everything in memory. If that is the case, or if you persist the data only once it is ready, you can consider using BinarySearch: https://msdn.microsoft.com/en-us/library/w4e7fxsh(v=vs.110).aspx and, because the list must stay sorted for that to work, you also get an ordered list at the end. If ordering is not important, you can use a HashSet, which is very fast and designed specifically for this purpose.
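
The BinarySearch idea could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: it stores plain int Ids rather than Product objects, and AddIfMissing is a hypothetical helper name. List<T>.BinarySearch expects the list to be sorted already and, when the item is not found, returns a negative number that is the bitwise complement of the index where the item should be inserted.

```csharp
using System;
using System.Collections.Generic;

public static class SortedInsertDemo
{
    // Hypothetical helper: inserts id into the sorted list only if it is absent.
    // Returns true when the id was inserted, false when it was already there.
    public static bool AddIfMissing(List<int> sortedIds, int id)
    {
        int index = sortedIds.BinarySearch(id);
        if (index >= 0)
        {
            return false; // already present
        }

        // ~index is the position that keeps the list sorted.
        sortedIds.Insert(~index, id);
        return true;
    }

    public static void Main()
    {
        var ids = new List<int>();
        foreach (var id in new[] { 5, 3, 5, 1, 3 })
        {
            AddIfMissing(ids, id);
        }

        Console.WriteLine(string.Join(", ", ids)); // 1, 3, 5
    }
}
```

The lookup is logarithmic, but List<T>.Insert still has to shift the elements after the insertion point, so this mainly pays off when a sorted result is wanted anyway.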

Check also: https://www.dotnetperls.com/hashset

Answer 3 (score: 1)

This should be pretty fast and takes care of any ordering:

// build a HashSet of your primary keys type (I'm assuming integers here) containing all your list elements' keys
var hashSet = new HashSet<int>(listProduct.Select(p => p.Id));

// add all items from the products list whose Id can be added to the hashSet (so it's not a duplicate)
listProduct.AddRange(products.Where(p => hashSet.Add(p.Id)));

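What you may want to consider, though, is implementing IEquatable<Product> on your Product type and overriding GetHashCode(), which makes the code above a bit simpler and puts the equality checks where they belong (inside the respective type):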

var hashSet = new HashSet<Product>(listProduct);
listProduct.AddRange(products.Where(hashSet.Add));
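
For completeness, here is a minimal sketch of what such a Product type might look like. It assumes that equality is defined by Id alone, which the answer implies but does not spell out; the class shape and the AddMissing helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Product type; equality is assumed to depend on Id only.
public class Product : IEquatable<Product>
{
    public int Id { get; set; }

    public bool Equals(Product other) => other != null && other.Id == Id;

    public override bool Equals(object obj) => Equals(obj as Product);

    // Must agree with Equals: equal products produce equal hash codes.
    public override int GetHashCode() => Id.GetHashCode();
}

public static class EquatableDemo
{
    // Appends to listProduct every product whose Id is not in the list yet.
    public static void AddMissing(List<Product> listProduct, IEnumerable<Product> products)
    {
        var seen = new HashSet<Product>(listProduct);

        // HashSet<T>.Add returns false for an element that is already in the set,
        // so only first occurrences pass the filter.
        listProduct.AddRange(products.Where(p => seen.Add(p)));
    }

    public static void Main()
    {
        var listProduct = new List<Product>
        {
            new Product { Id = 1 },
            new Product { Id = 2 },
        };
        var products = new List<Product>
        {
            new Product { Id = 2 },
            new Product { Id = 3 },
            new Product { Id = 3 },
            new Product { Id = 4 },
        };

        AddMissing(listProduct, products);
        Console.WriteLine(string.Join(", ", listProduct.Select(p => p.Id))); // 1, 2, 3, 4
    }
}
```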