Question

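I'm using a List<T> and need to update properties of objects that the list holds.

What would be the most efficient/quickest way to do this? I know that scanning through the indexes of a List<T> gets slower as the list grows, and that List<T> is not the most efficient collection for updates.

That said, would it be better to:

Remove the matching object and then add a new one?
Scan through the list indexes until I find the matching object, and then update that object's properties?

If I have a collection, say an IEnumerable, and I want to apply that IEnumerable's updates to the list, what would be the best approach?

Stub code sample: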

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

public class ProductRepository
{
    List<Product> product = Product.GetProduct();
    public void UpdateProducts(IEnumerable<Product> updatedProduct)
    {
    }
    public void UpdateProduct(Product updatedProduct)
    {
    }
}

Answer 1

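If you want fast lookups, consider using a Dictionary instead of a List. In your case the key would be the product ID (which I assume is unique). See Dictionary on MSDN.

For example: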

public class ProductRepository
{
    // Keyed by ProductId; assumes GetProduct() returns the products as a dictionary
    Dictionary<int, Product> products = Product.GetProduct();

    public void UpdateProducts(IEnumerable<Product> updatedProducts)
    {
        foreach (var productToUpdate in updatedProducts)
        {
            UpdateProduct(productToUpdate);
        }
    }

    public void UpdateProduct(Product productToUpdate)
    {
        // Look up the existing product by its ID (a single hash lookup)
        if (products.TryGetValue(productToUpdate.ProductId, out var product))
        {
            // Update code here...
            product.ProductName = productToUpdate.ProductName;
        }
        else
        {
            // Add the product, or throw an exception here if you prefer
            products.Add(productToUpdate.ProductId, productToUpdate);
        }
    }
}

Answer 2

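What exactly is efficiency?

Unless there are literally thousands of items, a foreach, a for, or any other kind of loop will most likely differ by only milliseconds. Really. So you have wasted more time (at a programmer's rate of $XX per hour, which costs more than the end user's time) looking for the best approach than it would ever save.

So if you really do have thousands of records, I would recommend finding efficiency by processing the list in parallel with the Parallel.ForEach method, which can process more records in the same time at the cost of some threading overhead.

IMHO, if the record count is greater than 100, that implies a database is being used. If a database is involved, write an update stored procedure and call it a day; I would be hard pressed to write a one-off program for a specific update that could be done more easily in said database.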
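As a rough sketch of the Parallel.ForEach idea, assuming the incoming updates are indexed by ProductId first and each element of the list is then visited by exactly one thread. The class ParallelUpdater and its ApplyUpdates method are hypothetical names introduced here for illustration, not part of the original answer:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

public static class ParallelUpdater // hypothetical helper for this sketch
{
    public static void ApplyUpdates(List<Product> products, IEnumerable<Product> updates)
    {
        // Index the (usually small) set of updates; the last update for an ID wins.
        var updateById = new Dictionary<int, Product>();
        foreach (var update in updates)
            updateById[update.ProductId] = update;

        // Each list element is handled by one thread, and the dictionary is
        // only read inside the loop, so no locking is needed here.
        Parallel.ForEach(products, product =>
        {
            if (updateById.TryGetValue(product.ProductId, out Product update))
            {
                product.ProductName = update.ProductName;
                product.Category = update.Category;
            }
        });
    }
}
```

For simple property assignments like these, a plain foreach over the same lookup may well be just as fast; measuring both on real data is the only way to tell.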

Answer 3

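Your use case is updating a List<T> that can contain millions of records, where the updated records can be a sub-list or just a single record.

Following is the schema: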

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

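Does Product contain a primary key, meaning that every Product object can be uniquely identified, there are no duplicates, and every update targets a single unique record?

If yes, then it is best to arrange the List<T> in the form of a Dictionary<int,T>. Each update from the IEnumerable<T> then has O(1) time complexity, so the total cost of all updates depends only on the size of the IEnumerable<T>, which I don't expect to be very big. Extra memory is needed for the additional data structure, but it is a very fast solution. @JamieLupton has already provided a solution along similar lines.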
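A minimal sketch of that arrangement, assuming ProductId is unique. The names ProductIndexDemo and ApplyUpdates are hypothetical. Because Product is a reference type, the dictionary and the original list share the same instances, so the list does not have to be rebuilt afterwards:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

public static class ProductIndexDemo // hypothetical name for this sketch
{
    // Returns how many updates matched an existing product.
    public static int ApplyUpdates(List<Product> products, IEnumerable<Product> updates)
    {
        // One O(N) pass builds the index; ToDictionary throws an
        // ArgumentException if two products share a ProductId.
        Dictionary<int, Product> byId = products.ToDictionary(p => p.ProductId);

        int matched = 0;
        foreach (var update in updates)
        {
            // Each lookup is O(1) on average.
            if (byId.TryGetValue(update.ProductId, out Product existing))
            {
                existing.ProductName = update.ProductName;
                existing.Category = update.Category;
                matched++;
            }
        }
        return matched;
    }

    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { ProductId = 1, ProductName = "Pen", Category = "Office" },
            new Product { ProductId = 2, ProductName = "Mug", Category = "Kitchen" },
        };
        var updates = new[] { new Product { ProductId = 2, ProductName = "Travel mug", Category = "Kitchen" } };

        ApplyUpdates(products, updates);
        Console.WriteLine(products[1].ProductName); // prints "Travel mug"
    }
}
```

If many batches of updates arrive over time, keeping the dictionary around instead of rebuilding it per call avoids paying the O(N) indexing cost repeatedly.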

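If Product can be repeated and there is no primary key, the above solution is not valid. The ideal way to scan through the List<T> is then binary search, whose time complexity is O(log N); note that this requires the list to be sorted on the search key.

Now, since the IEnumerable<T> is comparatively small (say M), the overall time complexity is O(M*logN), where M is much smaller than N and can be neglected.

List<T> supports a BinarySearch API that returns the element's index, which can then be used to update the object at that index; check the linked example.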
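A small sketch of the index-then-update pattern with List<T>.BinarySearch, assuming the list has already been sorted by ProductId with the same comparer. ProductIdComparer and BinarySearchUpdater are hypothetical names used only for this illustration:

```csharp
using System;
using System.Collections.Generic;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

// Orders products by ProductId only (hypothetical comparer; assumes no null items).
public sealed class ProductIdComparer : IComparer<Product>
{
    public int Compare(Product x, Product y) => x.ProductId.CompareTo(y.ProductId);
}

public static class BinarySearchUpdater // hypothetical helper for this sketch
{
    // Returns true when a matching product was found and updated.
    // Precondition: sortedProducts is sorted with the same comparer.
    public static bool TryUpdate(List<Product> sortedProducts, Product update, IComparer<Product> comparer)
    {
        int index = sortedProducts.BinarySearch(update, comparer);
        if (index < 0)
            return false; // a negative result means no element has that key

        sortedProducts[index].ProductName = update.ProductName;
        sortedProducts[index].Category = update.Category;
        return true;
    }
}
```

Typical use is to call products.Sort(comparer) once and then TryUpdate for each incoming record. When several elements share a key, BinarySearch returns one of them, not necessarily the first, so updating every duplicate would also need a scan of the neighbouring indexes.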

In my opinion, the best option for such a large number of records is parallel processing combined with binary search.

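Now, since thread safety is an issue, what I normally do is divide the List<T> into a List<T>[], because each unit can then be assigned to a separate thread. A simple way is to use the MoreLinq Batch API: you can get the number of system processors from Environment.ProcessorCount and then create an IEnumerable<IEnumerable<T>> as follows (here list is the List<T>):

// Batch takes the size of each batch, so derive it from the processor count
var batchSize = Math.Max(1, (list.Count + Environment.ProcessorCount - 1) / Environment.ProcessorCount);
var enumerableList = list.Batch(batchSize).ToList();

Another way is the following custom code:

public static class MyExtensions
{
    // data          - the List<T> to split
    // dataCount     - calculated once and passed in to avoid accessing the property every time
    // partitionSize - size of each partition, which can be a function of the number of processors
    public static List<T>[] SplitList<T>(this List<T> data, int dataCount, int partitionSize)
    {
        int remainderData;
        var fullPartition = Math.DivRem(dataCount, partitionSize, out remainderData);

        // Fewer items than one partition: return everything as a single partition
        if (fullPartition == 0)
            return new[] { data.GetRange(0, dataCount) };

        var listArray = new List<T>[fullPartition];
        var beginIndex = 0;

        for (var partitionCounter = 0; partitionCounter < fullPartition; partitionCounter++)
        {
            // The last partition also takes the remainder
            if (partitionCounter == fullPartition - 1)
                listArray[partitionCounter] = data.GetRange(beginIndex, partitionSize + remainderData);
            else
                listArray[partitionCounter] = data.GetRange(beginIndex, partitionSize);

            beginIndex += partitionSize;
        }

        return listArray;
    }
}

Now you can create a Task[], with one Task assigned to each List<T> element of the List<T>[] generated above, and then binary search each sub-partition. Although this is repetitive, it uses the power of both parallel processing and binary search. Each Task can be started, and then we can use Task.WaitAll(taskArray) to wait for task processing to finish.

Over and above that, if you create a Dictionary<int,T>[] and use parallel processing with it, that would be the fastest.

The final integration of the List<T>[] back into a List<T> can be done using LINQ Aggregate or SelectMany as follows:

List<T>[] splitListArray = data.SplitList(data.Count, partitionSize);
// Process splitListArray
var finalList = splitListArray.SelectMany(obj => obj).ToList();

Another option is to use Parallel.ForEach along with a thread-safe data structure such as ConcurrentBag<T>, or perhaps ConcurrentDictionary<int,T> if you are replacing complete objects; if only properties are being updated, a simple List<T> works fine. Parallel.ForEach internally uses a range partitioner similar to what I suggested above.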
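The one-Task-per-partition idea described above could be sketched roughly as follows, assuming the list is already sorted by ProductId. PartitionedUpdater and ProductIdComparer are hypothetical names for this illustration, and the slicing is done inline with GetRange rather than through a separate split helper:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

// Orders products by ProductId only (hypothetical comparer; assumes no null items).
public sealed class ProductIdComparer : IComparer<Product>
{
    public int Compare(Product x, Product y) => x.ProductId.CompareTo(y.ProductId);
}

public static class PartitionedUpdater // hypothetical helper for this sketch
{
    // Splits the sorted list into roughly one slice per processor and lets
    // one Task per slice binary-search that slice for every update.
    public static void ApplyUpdates(List<Product> sortedProducts, IReadOnlyList<Product> updates)
    {
        var comparer = new ProductIdComparer();
        int partitionCount = Math.Max(1, Environment.ProcessorCount);
        int partitionSize = Math.Max(1, (sortedProducts.Count + partitionCount - 1) / partitionCount);

        var tasks = new List<Task>();
        for (int start = 0; start < sortedProducts.Count; start += partitionSize)
        {
            // GetRange copies references, so each slice shares its Product
            // instances with the original list; a slice of a sorted list stays sorted.
            int length = Math.Min(partitionSize, sortedProducts.Count - start);
            List<Product> slice = sortedProducts.GetRange(start, length);

            tasks.Add(Task.Run(() =>
            {
                foreach (var update in updates)
                {
                    int index = slice.BinarySearch(update, comparer);
                    if (index >= 0)
                    {
                        slice[index].ProductName = update.ProductName;
                        slice[index].Category = update.Category;
                    }
                }
            }));
        }

        // Block until every partition has been processed.
        Task.WaitAll(tasks.ToArray());
    }
}
```

Because the slices hold the same object references as the original list, property updates are visible there without merging the slices back. With unique IDs most of the per-slice searches will miss, which is the repetition the answer mentions.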

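Which of the above solutions is ideal depends on your use case; you should be able to combine them to get the best result. Let me know if you need a specific example.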

C# generic List<T>: updating items
