Group an array into batches

Asked: 2017-09-27 09:47:36

Tags: c# arrays generics

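I'm trying to develop a solution in which I need to group an array by a property, and I'm having some trouble with it. I thought it would be very simple, but...

I have a list of suppliers that must be processed in batches grouped by status.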

public class SupplierDetails
{
    public int SupplierId { get; set; }
    public string Status { get; set; }
}

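The supplier list will contain different statuses and SupplierIds, and the suppliers need to be processed in the order in which they were inserted into the list. So I need to send the process an array of the first suppliers that share the same status; once that batch has been processed, I must send it the next suppliers that share the same status, and so on.

Sample list: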

[SupplierId=1,Status='C'],[SupplierId=2,Status='C'],[SupplierId=3,Status='A'],[SupplierId=4,Status='C']

The resulting list should let me send the suppliers to BatchProcess like this:

BatchProcess([SupplierId=1,Status='C'],[SupplierId=2,Status='C'])
BatchProcess([SupplierId=3,Status='A'])
BatchProcess([SupplierId=4,Status='C'])

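Another way to explain it: "So when you have AAABBBABABA, you would want the first three in one group, then the next three in a group, and then a bunch of single items." Thanks, @Chris.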
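
To make that description concrete, here is a minimal sketch that splits a string into runs of consecutive identical characters. The class and method names (RunSplitter, SplitIntoRuns) are hypothetical and only illustrate the grouping rule; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class RunSplitter
{
    // Splits a string into runs of consecutive identical characters,
    // preserving the original order (hypothetical helper for illustration).
    public static List<string> SplitIntoRuns(string text)
    {
        var runs = new List<string>();
        if (string.IsNullOrEmpty(text))
            return runs;

        int start = 0;
        for (int i = 1; i <= text.Length; i++)
        {
            // Close the current run at the end of the text or when the character changes.
            if (i == text.Length || text[i] != text[start])
            {
                runs.Add(text.Substring(start, i - start));
                start = i;
            }
        }
        return runs;
    }
}
```

Calling RunSplitter.SplitIntoRuns("AAABBBABABA") gives the seven runs AAA, BBB, A, B, A, B, A, which is the same shape of result wanted for the suppliers when the grouping key is Status.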

At the moment, the solution I've developed isn't very pretty, but it uses generics:

public List<IEnumerable<T>> FlattenBatchedArray<T, Y>(IEnumerable<T> arr, Func<T, Y> getPropertyValue)
{
    if (arr == null || arr.Count() == 0 || getPropertyValue == null)
        return null;

    var listArr = arr.ToArray();
    int indexSkip = 0;
    int indexTake = 0;
    var groupedList = new List<IEnumerable<T>>();

    while (indexTake < listArr.Length)
    {
        indexSkip = indexTake;
        var currentValue = getPropertyValue(listArr.ElementAt(indexSkip));
        indexTake = Array.FindIndex(listArr, indexSkip, x => !getPropertyValue(x).Equals(currentValue));
        if (indexTake == -1)
            indexTake = listArr.Length;
        var next = listArr.Skip(indexSkip).Take(indexTake - indexSkip);
        groupedList.Add(next.ToList());
    }
    return groupedList;
}

This function passes all the tests I need:

[TestMethod]
public void TestMethod3()
{
    var listPO = new List<SupplierDetails>();
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 1, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 2, Status = "C" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 3, Status = "C" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 4, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 5, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 6, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 7, Status = "C" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 8, Status = "O" });

    var result = FlattenBatchedArray(listPO, (x) => x.Status);
    Assert.AreEqual(result.Count, 5);
    Assert.AreEqual(result[0].Count(), 1);
    Assert.AreEqual(result[1].Count(), 2);
    Assert.AreEqual(result[2].Count(), 3);
    Assert.AreEqual(result[3].Count(), 1);
    Assert.AreEqual(result[4].Count(), 1);
    Assert.AreEqual(result[0].ElementAt(0).SupplierId, 1);
    Assert.AreEqual(result[0].ElementAt(0).Status, "O");
    Assert.AreEqual(result[1].ElementAt(0).SupplierId, 2);
    Assert.AreEqual(result[1].ElementAt(0).Status, "C");
    Assert.AreEqual(result[1].ElementAt(1).SupplierId, 3);
    Assert.AreEqual(result[1].ElementAt(1).Status, "C");
    Assert.AreEqual(result[2].ElementAt(0).SupplierId, 4);
    Assert.AreEqual(result[2].ElementAt(0).Status, "O");
    Assert.AreEqual(result[2].ElementAt(1).SupplierId, 5);
    Assert.AreEqual(result[2].ElementAt(1).Status, "O");
    Assert.AreEqual(result[2].ElementAt(2).SupplierId, 6);
    Assert.AreEqual(result[2].ElementAt(2).Status, "O");
    Assert.AreEqual(result[3].ElementAt(0).SupplierId, 7);
    Assert.AreEqual(result[3].ElementAt(0).Status, "C");
    Assert.AreEqual(result[4].ElementAt(0).SupplierId, 8);
    Assert.AreEqual(result[4].ElementAt(0).Status, "O");

    listPO = new List<SupplierDetails>();
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 1, Status = "O" });
    result = FlattenBatchedArray(listPO, (x) => x.Status);
    Assert.AreEqual(result.Count, 1);

    listPO = new List<SupplierDetails>();
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 1, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 2, Status = "O" });
    result = FlattenBatchedArray(listPO, (x) => x.Status);
    Assert.AreEqual(result.Count, 1);
    Assert.AreEqual(result[0].Count(), 2);

    listPO = new List<SupplierDetails>();
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 1, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 2, Status = "C" });
    result = FlattenBatchedArray(listPO, (x) => x.Status);
    Assert.AreEqual(result.Count, 2);
    Assert.AreEqual(result[0].Count(), 1);
    Assert.AreEqual(result[1].Count(), 1);
    Assert.AreEqual(result[0].ElementAt(0).SupplierId, 1);
    Assert.AreEqual(result[0].ElementAt(0).Status, "O");
    Assert.AreEqual(result[1].ElementAt(0).SupplierId, 2);
    Assert.AreEqual(result[1].ElementAt(0).Status, "C");

    listPO = new List<SupplierDetails>();
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 1, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 2, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 3, Status = "C" });
    result = FlattenBatchedArray(listPO, (x) => x.Status);
    Assert.AreEqual(result.Count, 2);
    Assert.AreEqual(result[0].Count(), 2);
    Assert.AreEqual(result[1].Count(), 1);
    Assert.AreEqual(result[0].ElementAt(0).SupplierId, 1);
    Assert.AreEqual(result[0].ElementAt(0).Status, "O");
    Assert.AreEqual(result[0].ElementAt(1).SupplierId, 2);
    Assert.AreEqual(result[0].ElementAt(1).Status, "O");
    Assert.AreEqual(result[1].ElementAt(0).SupplierId, 3);
    Assert.AreEqual(result[1].ElementAt(0).Status, "C");

    listPO = new List<SupplierDetails>();
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 1, Status = "O" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 2, Status = "C" });
    listPO.Add(new UnitTestProject2.SupplierDetails() { SupplierId = 3, Status = "C" });
    result = FlattenBatchedArray(listPO, (x) => x.Status);
    Assert.AreEqual(result.Count, 2);
    Assert.AreEqual(result[0].Count(), 1);
    Assert.AreEqual(result[1].Count(), 2);
    Assert.AreEqual(result[0].ElementAt(0).SupplierId, 1);
    Assert.AreEqual(result[0].ElementAt(0).Status, "O");
    Assert.AreEqual(result[1].ElementAt(0).SupplierId, 2);
    Assert.AreEqual(result[1].ElementAt(0).Status, "C");
    Assert.AreEqual(result[1].ElementAt(1).SupplierId, 3);
    Assert.AreEqual(result[1].ElementAt(1).Status, "C");
}

2 answers:

Answer 0 (score: 3)

Here is a solution that does not enumerate the input sequence more than once:

public static IEnumerable<IEnumerable<TSource>> Batch<TKey, TSource>(IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    var buffer = new List<TSource>();

    foreach (var element in source)
    {
        if ((buffer.Count > 0) && !Equals(keySelector(element), keySelector(buffer.Last())))
        {
            yield return buffer.ToArray();
            buffer.Clear();
        }

        buffer.Add(element);
    }

    if (buffer.Count > 0)
        yield return buffer.ToArray();
}

(This also handles the case where one of the keys is null, although that may not be necessary.)
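
The null handling comes from calling the static object.Equals(a, b) rather than the instance method a.Equals(b). The following minimal sketch contrasts the two; the class and method names are hypothetical and exist only for this illustration.

```csharp
using System;

public static class NullKeyComparison
{
    // The static object.Equals checks both arguments for null
    // before delegating to the instance Equals method.
    public static bool SafeEquals(string left, string right)
    {
        return object.Equals(left, right);
    }

    // The instance call dereferences `left`, so a null left-hand key throws.
    // Returns true when a NullReferenceException was observed.
    public static bool InstanceEqualsThrows(string left, string right)
    {
        try
        {
            left.Equals(right);
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

With these helpers, SafeEquals(null, "C") returns false and SafeEquals(null, null) returns true, whereas InstanceEqualsThrows(null, "C") reports that the instance call threw.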

[Edit: I simplified it after realizing that I didn't need the prev variable, because I can use buffer.Last().]
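
As a rough usage sketch, the sample list from the question can be fed through the same buffering approach. The Batch method is repeated so the example is self-contained; BatchUsage, Describe and ProcessSample are hypothetical names, and Describe merely stands in for the real BatchProcess, whose signature is not given in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SupplierDetails
{
    public int SupplierId { get; set; }
    public string Status { get; set; }
}

public static class BatchUsage
{
    // Same approach as the answer: buffer items until the key changes.
    public static IEnumerable<IEnumerable<TSource>> Batch<TKey, TSource>(
        IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var buffer = new List<TSource>();

        foreach (var element in source)
        {
            if ((buffer.Count > 0) && !Equals(keySelector(element), keySelector(buffer.Last())))
            {
                yield return buffer.ToArray();
                buffer.Clear();
            }

            buffer.Add(element);
        }

        if (buffer.Count > 0)
            yield return buffer.ToArray();
    }

    // Hypothetical stand-in for BatchProcess: renders a batch as "id:status" pairs.
    public static string Describe(IEnumerable<SupplierDetails> batch)
    {
        return string.Join(",", batch.Select(s => s.SupplierId + ":" + s.Status));
    }

    // Runs the sample list from the question through Batch, one batch at a time.
    public static List<string> ProcessSample()
    {
        var suppliers = new List<SupplierDetails>
        {
            new SupplierDetails { SupplierId = 1, Status = "C" },
            new SupplierDetails { SupplierId = 2, Status = "C" },
            new SupplierDetails { SupplierId = 3, Status = "A" },
            new SupplierDetails { SupplierId = 4, Status = "C" },
        };

        var processed = new List<string>();
        foreach (var batch in Batch(suppliers, s => s.Status))
            processed.Add(Describe(batch));
        return processed;
    }
}
```

ProcessSample() produces three batches in insertion order: suppliers 1 and 2 (status C), then supplier 3 (status A), then supplier 4 (status C), matching the three BatchProcess calls listed in the question.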

Answer 1 (score: 2)

For a more generic solution, you could write an extension method such as:

public static IEnumerable<IEnumerable<TSource>> SequentialGroupBy<TSource, TSelector>(this IEnumerable<TSource> source, Func<TSource, TSelector> selector)
{
    if (!source.Any())
        return Enumerable.Empty<IEnumerable<TSource>>();

    var result = new List<List<TSource>>();
    result.Add(new List<TSource> { source.ElementAt(0) });

    foreach (var item in source.Skip(1))
    {
        if (selector(result.Last()[0]).Equals(selector(item)))
            result.Last().Add(item);
        else
            result.Add(new List<TSource> { item });
    }
    return result;
}

And use it like this:

var result = collection.SequentialGroupBy(item => item.Status).ToList();

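Note that this still consumes the entire collection.

A pure LINQ version of the extension method, which benefits from LINQ's deferred execution, would be the following (the Select/GroupBy query itself is deferred, although source.Any() and source.First() still run as soon as the method is called):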
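
To illustrate what consuming the entire collection means in practice, the sketch below counts how many source items are pulled before the first group becomes available. It compares a list-building method with an iterator that yields each group as soon as the key changes. All names here (ConsumptionDemo, CountingSource, EagerRuns, LazyRuns, PulledForFirstGroup) are hypothetical, and the two grouping methods are simplified stand-ins rather than the exact code from the answers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConsumptionDemo
{
    // Number of items pulled from CountingSource so far.
    public static int Pulled;

    // Yields 1..count and records every item that is actually requested.
    public static IEnumerable<int> CountingSource(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Eager: builds every group before returning anything.
    public static List<List<T>> EagerRuns<T, TKey>(IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var result = new List<List<T>>();
        foreach (var item in source)
        {
            if (result.Count > 0 && Equals(keySelector(result[result.Count - 1][0]), keySelector(item)))
                result[result.Count - 1].Add(item);
            else
                result.Add(new List<T> { item });
        }
        return result;
    }

    // Lazy: an iterator that yields each group as soon as the key changes.
    public static IEnumerable<List<T>> LazyRuns<T, TKey>(IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        List<T> current = null;
        foreach (var item in source)
        {
            if (current != null && !Equals(keySelector(current[0]), keySelector(item)))
            {
                yield return current;
                current = null;
            }
            if (current == null)
                current = new List<T>();
            current.Add(item);
        }
        if (current != null)
            yield return current;
    }

    // Returns how many source items were pulled in order to obtain the first group.
    public static int PulledForFirstGroup(bool lazy)
    {
        Pulled = 0;
        var source = CountingSource(9);
        if (lazy)
            LazyRuns(source, x => x / 3).First();
        else
            EagerRuns(source, x => x / 3).First();
        return Pulled;
    }
}
```

With the key x / 3 over the numbers 1 to 9, the first group is {1, 2}. The lazy variant pulls only three items (it has to see 3 to know the first run has ended), while the eager variant pulls all nine before the first group can be read.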

public static IEnumerable<IEnumerable<TSource>> SequentialGroupBy<TSource, TSelector>(this IEnumerable<TSource> source, Func<TSource, TSelector> selector)
{
    if (!source.Any())
        return Enumerable.Empty<IEnumerable<TSource>>();

    var prevSelector = selector(source.First());
    int groupingIndex = 0;

    return source.Select(item =>
    {
        var currentSelector = selector(item);
        if (!prevSelector.Equals(currentSelector))
        {
            prevSelector = currentSelector;
            groupingIndex++;
        }
        return new { groupingIndex, currentSelector, item };
    }).GroupBy(item => new { item.groupingIndex, item.currentSelector }, select => select.item);
}