Question

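I have the following extension method that splits a List<T> into a list of List<T>s with different chunk sizes, but I doubt its efficiency. Is there anything I can do to improve it, or is it fine as it is?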

public static List<List<T>> Split<T>(this List<T> source, params int[] chunkSizes)
{
    int totalSize = chunkSizes.Sum();
    int sourceCount = source.Count();
    if (totalSize > sourceCount)
    {
        throw new ArgumentException("Sum of chunk sizes is larger than the number of elements in source.", "chunkSizes");
    }

    List<List<T>> listOfLists = new List<List<T>>(chunkSizes.Length);
    int index = 0;
    foreach (int chunkSize in chunkSizes)
    {
        listOfLists.Add(source.GetRange(index, chunkSize));
        index += chunkSize;
    }

    // Get the entire last part if the total size of all the chunks is less than the actual size of the source
    if (totalSize < sourceCount)
    {
        listOfLists.Add(source.GetRange(index, sourceCount - totalSize));
    }

    return listOfLists;
}

Example usage:

List<int> list = new List<int> { 1,2,4,5,6,7,8,9,10,12,43,23,453,34,23,112,4,23 };
var result =  list.Split(2, 3, 3, 2, 1, 3);
Console.WriteLine(result);

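This produces the desired result, and the final list part has 4 numbers, because the total of the chunk sizes (14) is 4 less than the size of my list (18).

I am especially doubtful about the GetRange part, because I fear it just enumerates the same source over and over again...

Edit: I think I know a way to enumerate the source only once: just do a foreach over the source itself and keep checking whether the number of iterated elements equals the current chunk size. If so, add a new list and move on to the next chunk size. Thoughts?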
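
The single-pass idea from the edit could look roughly like the sketch below. It is only an illustration under a few assumptions: the method name SplitSinglePass and the class name are hypothetical, it walks an explicit enumerator instead of a literal foreach so that each chunk can pull exactly the number of elements it needs, and it accepts any IEnumerable<T> rather than only List<T>.

```csharp
using System;
using System.Collections.Generic;

public static class SinglePassSplitExtensions
{
    // Hypothetical name, not from the original post.
    public static List<List<T>> SplitSinglePass<T>(this IEnumerable<T> source, params int[] chunkSizes)
    {
        var result = new List<List<T>>(chunkSizes.Length + 1);

        // One enumerator is shared by all chunks, so the source is walked exactly once.
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            foreach (int size in chunkSizes)
            {
                var chunk = new List<T>(size);
                for (int i = 0; i < size; i++)
                {
                    // Running out of elements means the chunk sizes sum to more than the source holds.
                    if (!e.MoveNext())
                    {
                        throw new ArgumentException(
                            "Sum of chunk sizes is larger than the number of elements in source.",
                            nameof(chunkSizes));
                    }
                    chunk.Add(e.Current);
                }
                result.Add(chunk);
            }

            // Collect whatever is left into one final list, mirroring the method above.
            var rest = new List<T>();
            while (e.MoveNext())
            {
                rest.Add(e.Current);
            }
            if (rest.Count > 0)
            {
                result.Add(rest);
            }
        }

        return result;
    }
}
```

Called as list.SplitSinglePass(2, 3, 3, 2, 1, 3) on the 18-element list above, this should give seven lists, the last one holding the four leftover elements. For a List<T> source this is unlikely to beat GetRange, since it adds elements one at a time; it would mainly be of interest when the source is a sequence that does not support indexing.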

Answer 1

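There is no performance problem with this code. GetRange is documented to be O(chunkSize), which is also easy to infer, since one of the most important properties of List<T> is precisely that it allows O(1) indexing.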

That said, you could write a more LINQ-y version of the code:

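// Build (start, length) pairs. ToArray forces the side-effecting lambda to run once per chunk.
var rangeStart = 0;
var ranges = chunkSizes.Select(n => Tuple.Create((rangeStart += n) - n, n))
                       .ToArray();
var lists = ranges.Select(r => source.GetRange(r.Item1, r.Item2)).ToList();

// Put any leftover elements into their own final list, as the method in the question does.
if (rangeStart < source.Count) {
    lists.Add(source.GetRange(rangeStart, source.Count - rangeStart));
}

return lists;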

Answer 2

I suggest using this extension method to chunk the source list into sublists of a specified chunk size:

using System.Collections.Generic;
using System.Linq;

...

/// <summary>
/// Helper methods for the lists.
/// </summary>
public static class ListExtensions
{
    public static List<List<T>> ChunkBy<T>(this List<T> source, int chunkSize) 
    {
        return source
            .Select((x, i) => new { Index = i, Value = x })
            .GroupBy(x => x.Index / chunkSize)
            .Select(x => x.Select(v => v.Value).ToList())
            .ToList();
    }
}

How to efficiently split a List with different chunk sizes?
