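Splitting a collection into n parts does not produce the desired sequences

Date: 2015-04-01 10:30:09

Tags: .net vb.net linq ienumerable

I'm trying to split a collection into a specific number of parts, and I took some help from the solutions to this StackOverflow question: Split a collection into `n` parts with LINQ?

This is my VB.Net translation of @Hasan Khan's solution: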

''' <summary>
''' Splits an <see cref="IEnumerable(Of T)"/> into the specified amount of sequences.
''' </summary>
Public Shared Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
                                            ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))

    Dim i As Integer = 0

    Dim splits As IEnumerable(Of IEnumerable(Of T)) =
                 From item As T In col
                 Group item By item = Threading.Interlocked.Increment(i) Mod amount
                 Into Group
                 Select Group.AsEnumerable()

    Return splits


End Function

And this is my VB.Net translation of @manu08's solution:

''' <summary>
''' Splits an <see cref="IEnumerable(Of T)"/> into the specified amount of sequences.
''' </summary>
Public Shared Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
                                            ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))

    Return col.Select(Function(item, index) New With {index, item}).
               GroupBy(Function(x) x.index Mod amount).
               Select(Function(x) x.Select(Function(y) y.item))

End Function

The problem is that both functions return the wrong result.

If I split a collection like this:

Dim mainCol As IEnumerable(Of Integer) = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Dim splittedCols As IEnumerable(Of IEnumerable(Of Integer)) =
    SplitIntoParts(col:=mainCol, amount:=2)

both functions give this result:

1: { 1, 3, 5, 7, 9 }
2: { 2, 4, 6, 8, 10 }

instead of these sequences:

1: { 1, 2, 3, 4, 5 } 
2: { 6, 7, 8, 9, 10 }

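What am I doing wrong?

3 answers:

Answer 0 (score: 3)

The MyExtensions class has two public Split methods:

  1. For ICollection: iterates the collection only once, to split it.
  2. For IEnumerable: iterates the enumerable twice, once to count the items and once to split them. Avoid it if possible (the first overload is safe and twice as fast).
  3. More info: this algorithm tries to return exactly the specified number of collections.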

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class MyExtensions
    {
        // Works with ICollection - iterates through collection only once.
        public static IEnumerable<IEnumerable<T>> Split<T>(this ICollection<T> items, int count)
        {
            return Split(items, items.Count, count);
        }
    
        // Works with IEnumerable and iterates items TWICE: first for count items, second to split them.
        public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, int count)
        {            
            // ReSharper disable PossibleMultipleEnumeration
            var itemsCount = items.Count();
            return Split(items, itemsCount, count);
            // ReSharper restore PossibleMultipleEnumeration
        }
    
        private static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, int itemsCount, int partsCount)
        {
            if (items == null)
                throw new ArgumentNullException("items");
            if (partsCount <= 0)
                throw new ArgumentOutOfRangeException("partsCount");
    
            var rem = itemsCount % partsCount;
            var min = itemsCount / partsCount;
            var max = rem != 0 ? min + 1 : min;
    
            var index = 0;
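            // Note: every part reads lazily from this one shared enumerator, so the parts
            // must be consumed fully and in order (as the sample program does).
            var enumerator = items.GetEnumerator();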
    
            while (index < itemsCount)
            {
                var size = 0 < rem-- ? max : min;
                yield return SplitPart(enumerator, size);
                index += size;
            }
        }
    
        private static IEnumerable<T> SplitPart<T>(IEnumerator<T> enumerator, int count)
        {
            for (var i = 0; i < count; i++)
            {
                if (!enumerator.MoveNext())
                    break;
                yield return enumerator.Current;
            }            
        }
    }
    

    Sample program:

    var items = new [] {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'};
    
    for(var i = 1; i <= items.Length + 3; i++)
    {
        Console.WriteLine("{0} part(s)", i);
        foreach (var part in items.Split(i))
            Console.WriteLine(string.Join(", ", part));
        Console.WriteLine();
    }
    

    ...and the output of that program:

    1 part(s)
    a, b, c, d, e, f, g, h, i, j
    
    2 part(s)
    a, b, c, d, e
    f, g, h, i, j
    
    3 part(s)
    a, b, c, d
    e, f, g
    h, i, j
    
    4 part(s)
    a, b, c
    d, e, f
    g, h
    i, j
    
    5 part(s)
    a, b
    c, d
    e, f
    g, h
    i, j
    
    6 part(s)
    a, b
    c, d
    e, f
    g, h
    i
    j
    
    7 part(s)
    a, b
    c, d
    e, f
    g
    h
    i
    j
    
    8 part(s)
    a, b
    c, d
    e
    f
    g
    h
    i
    j
    
    9 part(s)
    a, b
    c
    d
    e
    f
    g
    h
    i
    j
    
    10 part(s)
    a
    b
    c
    d
    e
    f
    g
    h
    i
    j
    
    11 part(s) // Only 10 items in collection.
    a
    b
    c
    d
    e
    f
    g
    h
    i
    j
    
    12 part(s) // Only 10 items in collection.
    a
    b
    c
    d
    e
    f
    g
    h
    i
    j
    
    13 part(s)  // Only 10 items in collection.
    a
    b
    c
    d
    e
    f
    g
    h
    i
    j
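
Since the question asks for VB.Net, here is a minimal sketch of the same sizing rule (the first `count Mod parts` parts get one extra item). `SplitBalanced` and `BalancedSplitDemo` are hypothetical names, not part of the answer above. Unlike the C# version, this sketch copies each slice into its own list, which trades some memory and speed for parts that can be consumed in any order.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module BalancedSplitDemo

    ' Hypothetical helper: splits items into contiguous parts whose sizes differ by at most one.
    ' Being an iterator, it validates its arguments only when enumeration starts.
    Public Iterator Function SplitBalanced(Of T)(items As IList(Of T),
                                                 parts As Integer) As IEnumerable(Of IEnumerable(Of T))
        If items Is Nothing Then Throw New ArgumentNullException("items")
        If parts <= 0 Then Throw New ArgumentOutOfRangeException("parts")

        Dim minSize As Integer = items.Count \ parts     ' integer division
        Dim remainder As Integer = items.Count Mod parts ' number of parts that get one extra item
        Dim start As Integer = 0

        While start < items.Count
            Dim size As Integer = If(remainder > 0, minSize + 1, minSize)
            remainder -= 1
            ' Copy the slice so that each part is independent of the others.
            Yield items.Skip(start).Take(size).ToList()
            start += size
        End While
    End Function

    Sub Main()
        Dim letters As String() = {"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"}
        For Each part In SplitBalanced(letters, 3)
            Console.WriteLine(String.Join(", ", part))
        Next
    End Sub

End Module
```

With 10 items and 3 parts this should print the same three lines as the "3 part(s)" output above.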
    

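Answer 1 (score: 1)

You're not doing anything wrong; the methods you're using simply don't preserve the ordering the way you want. Think about how `Mod` and `GroupBy` work together and you'll see why: consecutive indexes produce alternating remainders, so neighbouring items end up in different groups.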
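
To make the difference concrete, here is a small sketch that groups the same ten numbers by two different keys. `GroupByIndexKey` and `GroupingKeyDemo` are hypothetical names used only for this illustration. Grouping by `index Mod amount` interleaves the items, while grouping by `index \ chunkSize` (integer division) keeps neighbours together.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module GroupingKeyDemo

    ' Hypothetical helper: groups items by a key computed from each item's zero-based index
    ' and renders every group as a comma-separated string.
    Function GroupByIndexKey(items As IEnumerable(Of Integer),
                             keyOf As Func(Of Integer, Integer)) As List(Of String)
        Return items.Select(Function(item, index) New With {.Value = item, .Key = keyOf(index)}).
                     GroupBy(Function(x) x.Key, Function(x) x.Value.ToString()).
                     Select(Function(g) String.Join(", ", g.ToArray())).
                     ToList()
    End Function

    Sub Main()
        Dim items As Integer() = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
        Dim amount As Integer = 2
        Dim chunkSize As Integer = CInt(Math.Ceiling(items.Length / amount))

        ' index Mod amount alternates 0, 1, 0, 1, ... so neighbours land in different groups.
        For Each line In GroupByIndexKey(items, Function(index) index Mod amount)
            Console.WriteLine("Mod key: " & line)
        Next

        ' index \ chunkSize stays 0 for the first chunkSize items, then becomes 1, and so on.
        For Each line In GroupByIndexKey(items, Function(index) index \ chunkSize)
            Console.WriteLine("\ key:   " & line)
        Next
    End Sub

End Module
```

The `Mod` key reproduces the result from the question ({1, 3, 5, 7, 9} and {2, 4, 6, 8, 10}); the integer-division key gives {1, 2, 3, 4, 5} and {6, 7, 8, 9, 10}.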

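I suggest you use Jon Skeet's answer, since it preserves the order of your collection (I took the liberty of translating it to VB.Net).

You have to calculate the size of each partition beforehand, because it doesn't split the collection into n chunks but into chunks of length n: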

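Imports System.Collections.ObjectModel
Imports System.Runtime.CompilerServices

' Note: VB.Net extension methods must be declared inside a Module, so there is no Shared keyword.
<Extension>
Public Iterator Function Partition(Of T)(source As IEnumerable(Of T), size As Integer) As IEnumerable(Of IEnumerable(Of T))
    Dim buffer As T() = Nothing
    Dim count As Integer = 0
    For Each item As T In source
        ' Allocate a fresh buffer at the start of each chunk.
        If buffer Is Nothing Then
            buffer = New T(size - 1) {}
        End If
        buffer(count) = item
        count += 1
        ' The buffer is full: emit it as a read-only chunk and start over.
        If count = size Then
            Yield New ReadOnlyCollection(Of T)(buffer)
            buffer = Nothing
            count = 0
        End If
    Next
    ' Emit the final, shorter chunk (if any), trimmed to its real length.
    If buffer IsNot Nothing Then
        Array.Resize(buffer, count)
        Yield New ReadOnlyCollection(Of T)(buffer)
    End If
End Function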

Use it like this:

mainCol.Partition(CInt(Math.Ceiling(mainCol.Count() / 2)))

Feel free to hide the Partition(CInt(Math.Ceiling(...))) part in a new method.
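
One possible shape for such a wrapper is sketched below. The wrapper's name `SplitIntoParts` and the module name `PartitionDemo` are assumptions, and the compact `Partition` here is only a stand-in so the example compiles on its own. The wrapper materializes the source once, so counting and splitting do not enumerate it twice.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.CompilerServices

Module PartitionDemo

    ' Compact stand-in for the Partition method above: yields chunks of "size" items.
    <Extension>
    Public Iterator Function Partition(Of T)(source As IEnumerable(Of T),
                                             size As Integer) As IEnumerable(Of IEnumerable(Of T))
        Dim chunk As New List(Of T)(size)
        For Each item As T In source
            chunk.Add(item)
            If chunk.Count = size Then
                Yield chunk
                chunk = New List(Of T)(size)
            End If
        Next
        If chunk.Count > 0 Then
            Yield chunk
        End If
    End Function

    ' Hypothetical wrapper: hides the chunk-size arithmetic behind an "amount of parts" parameter.
    <Extension>
    Public Function SplitIntoParts(Of T)(source As IEnumerable(Of T),
                                         amount As Integer) As IEnumerable(Of IEnumerable(Of T))
        If source Is Nothing Then Throw New ArgumentNullException("source")
        If amount <= 0 Then Throw New ArgumentOutOfRangeException("amount")

        Dim items As List(Of T) = source.ToList() ' enumerate the source only once
        Dim size As Integer = CInt(Math.Ceiling(items.Count / amount))
        Return items.Partition(size)
    End Function

    Sub Main()
        Dim mainCol As IEnumerable(Of Integer) = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
        For Each part In mainCol.SplitIntoParts(2)
            Console.WriteLine(String.Join(", ", part.Select(Function(n) n.ToString())))
        Next
    End Sub

End Module
```

Because the chunk size is rounded up, this approach can return fewer parts than requested: 10 items split into 6 parts gives a chunk size of 2 and therefore only 5 parts.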

Answer 2 (score: 1)

An inefficient solution (it iterates over the data too many times):

class Program
{
    static void Main(string[] args)
    {
        var data = Enumerable.Range(1, 10);
        var result = data.Split(2);            
    }
}

static class Extensions
{
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> col, int amount)
    {
        var chunkSize = (int)Math.Ceiling((double)col.Count() / (double)amount);

        for (var i = 0; i < amount; ++i)
            yield return col.Skip(chunkSize * i).Take(chunkSize);
    }
}

Edit

In VB.Net:

Public Shared Iterator Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
                                                     ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))

    Dim chunkSize As Integer = CInt(Math.Ceiling(CDbl(col.Count()) / CDbl(amount)))

    For i As Integer = 0 To amount - 1
        Yield col.Skip(chunkSize * i).Take(chunkSize)
    Next

End Function