Performant running SUM OVER partition in LINQ

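Time: 2019-05-15 16:48:20

Tags: c# sql-server linq

I am trying to work out the best way to compute a partitioned running sum in LINQ using a self-joined collection.

The following query is a simple example of what I am after. The output is RowNumber, RowType, and the running sum of RowValue over the current row and all previous rows with the same RowType.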

DECLARE @T TABLE (RowNumber INT, RowType INT, RowValue INT) 

INSERT @T VALUES (1,1,1),(2,1,1),(3,1,1),(4,1,1),(5,1,1),(6,2,1),(7,2,1),(8,2,1),(9,2,1),(10,2,1) 

;WITH Data AS(SELECT RowNumber, RowType,RowValue FROM @T)

SELECT
    This.RowNumber,
    This.RowType,
    RunningValue = COALESCE(This.RowValue + SUM(Prior.RowValue),This.RowValue)
FROM
    Data This
    LEFT OUTER JOIN Data Prior ON Prior.RowNumber <  This.RowNumber AND Prior.RowType = This.RowType
GROUP BY
    This.RowNumber,
    This.RowType,
    This.RowValue
/* OR
SELECT
    This.RowNumber,
    This.RowType,
    RunningValue = SUM(RowValue) OVER (PARTITION BY RowType ORDER BY RowNumber)
FROM
    Data This
*/

Here is my attempt so far, which does not work:

var joinedWithPreviousSums = allRows.Join(
    allRows,
    previousRows => new {previousRows.RowNumber, previousRows.RowType, previousRows.RowValue}, 
    row=> new { row.RowNumber, row.RowType, row.RowValue}, 
    (previousRows, row) => new { row.RowNumber, row.RowType, row.RowValue })
    .Where(previousRows.RowType == row.RowType && previousRows.RowNumber < row.RowNumber)
    .Select(row.RowNumber, row.RowType,RunningValue = Sum(previousRows.Value) + row.RowValue)).ToList()

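Of course, the last two lines above are garbage. They only try to illustrate the projection I want, while hinting at my lack of understanding of performant LINQ projections.

I have read that some variation of the statement below could work, but is there a way to get a similar result without resorting to this?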
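For reference, the projection the SQL query describes can be written as a real self-join with GroupJoin. This is only a minimal sketch: the class name SelfJoinRunningSum and the use of value tuples in place of the row class are assumptions made for illustration, and it still compares every row with every other row of the same RowType, so it is not the fast version being asked for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; rows are modelled as (RowNumber, RowType, RowValue) tuples.
public static class SelfJoinRunningSum
{
    public static List<(int RowNumber, int RowType, int RunningValue)> Compute(
        IEnumerable<(int RowNumber, int RowType, int RowValue)> allRows)
    {
        var rows = allRows.ToList();

        // GroupJoin pairs each row with all rows of the same RowType,
        // much like the LEFT OUTER JOIN ... GROUP BY in the SQL query.
        return rows
            .GroupJoin(
                rows,
                row => row.RowType,
                prior => prior.RowType,
                (row, sameType) => (
                    RowNumber: row.RowNumber,
                    RowType: row.RowType,
                    // Current value plus every earlier value in the partition.
                    RunningValue: row.RowValue + sameType
                        .Where(prior => prior.RowNumber < row.RowNumber)
                        .Sum(prior => prior.RowValue)))
            .ToList();
    }
}
```

GroupJoin keeps the order of the outer sequence, so the result comes back in the same order as the input rows.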

int s = 0;
var subgroup  = people.OrderBy(x => x.Amount)
                      .TakeWhile(x => (s += x.Amount) < 1000)
                      .ToList(); 

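Edit: I have been able to get the snippet below working, but I cannot seem to partition or project on RowType.

using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication1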
{
    class Program
    {
        delegate string CreateGroupingDelegate(int i);

        static void Main(string[] args)
        {
            List <TestClass> list = new List<TestClass>() 
               {
                    new TestClass(1, 1, 1),
                    new TestClass(2, 2, 5), 
                    new TestClass(3, 1, 1 ),
                    new TestClass(4, 2, 5),
                    new TestClass(5, 1, 1),
                    new TestClass(6, 2, 5)
            };
            int running_total = 0;

            var result_set = list.Select(x => new { x.RowNumber, x.RowType, running_total = (running_total = running_total + x.RowValue) }).ToList();


            foreach (var v in result_set)
            {
                Console.WriteLine("list element: {0}, total so far: {1}",
                    v.RowNumber,
                    v.running_total);
            }

            Console.ReadLine();
        }
    }

    public class TestClass
    {
        public TestClass(int rowNumber, int rowType, int rowValue)
        {
            RowNumber = rowNumber;
            RowType = rowType;
            RowValue = rowValue;
        }

        public int RowNumber { get; set; }
        public int RowType { get; set; }
        public int RowValue { get; set; }
    }

}
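One way to add the missing partitioning to a single-pass running total is to keep one total per RowType in a dictionary. The sketch below assumes the rows arrive already ordered by RowNumber; the class name PartitionedRunningTotal and the tuple shape are made up for illustration and are not part of the original snippet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; rows are (RowNumber, RowType, RowValue) tuples.
public static class PartitionedRunningTotal
{
    public static IEnumerable<(int RowNumber, int RowType, int RunningValue)> Compute(
        IEnumerable<(int RowNumber, int RowType, int RowValue)> rows)
    {
        // One running total per RowType, updated as rows stream past.
        var totals = new Dictionary<int, int>();

        foreach (var row in rows)
        {
            // soFar is 0 the first time a RowType is seen.
            totals.TryGetValue(row.RowType, out var soFar);
            soFar += row.RowValue;
            totals[row.RowType] = soFar;

            yield return (row.RowNumber, row.RowType, soFar);
        }
    }
}
```

For the six rows in the snippet above, this yields running values 1, 5, 2, 10, 3, 15 in RowNumber order. The mutable state lives inside the iterator rather than in a variable captured by a Select lambda, so enumerating the result twice starts from fresh totals each time.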

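2 Answers:

Answer 0 (score: 2)

Your answer can be simplified considerably, but even so it scales poorly: for each row it has to run the Where over every row, so it is O(list.Count^2).

Here is a simpler version that preserves the original order: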

var result = list.Select(item => new {
    RowType = item.RowType,
    RowValue = list.Where(prior => prior.RowNumber <= item.RowNumber && prior.RowType == item.RowType).Sum(prior => prior.RowValue)
});

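If you are willing to sort, you can make a single pass over list. (If you know the order is already correct, or a simpler sort will do, you can remove or replace the OrderBy / ThenBy.)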

var ans = list.OrderBy(x => x.RowType)
              .ThenBy(x => x.RowNumber)
              .Scan(first => new { first.RowType, first.RowValue },
                    (res, cur) => res.RowType == cur.RowType ? new { res.RowType, RowValue = res.RowValue + cur.RowValue }
                                                             : new { cur.RowType, cur.RowValue }
              );

This answer uses an extension method that works like Aggregate but returns the intermediate results, modelled on the APL scan operator:

// TRes seedFn(T FirstValue)
// TRes combineFn(TRes PrevResult, T CurValue)
public static IEnumerable<TRes> Scan<T, TRes>(this IEnumerable<T> src, Func<T, TRes> seedFn, Func<TRes, T, TRes> combineFn) {
    using (var srce = src.GetEnumerator()) {
        if (srce.MoveNext()) {
            var prev = seedFn(srce.Current);

            while (srce.MoveNext()) {
                yield return prev;
                prev = combineFn(prev, srce.Current);
            }
            yield return prev;
        }
    }
}
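To see the pieces together, here is a small end-to-end sketch that repeats the Scan method and applies it to tuple rows. Unlike the query above, this variant also carries RowNumber through to the result, since the question asks for it in the output. The class name ScanRunningSum and the tuple shape are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScanRunningSum
{
    // Same Scan as above: like Aggregate, but yields every intermediate result.
    public static IEnumerable<TRes> Scan<T, TRes>(
        this IEnumerable<T> src, Func<T, TRes> seedFn, Func<TRes, T, TRes> combineFn)
    {
        using (var srce = src.GetEnumerator())
        {
            if (srce.MoveNext())
            {
                var prev = seedFn(srce.Current);
                while (srce.MoveNext())
                {
                    yield return prev;
                    prev = combineFn(prev, srce.Current);
                }
                yield return prev;
            }
        }
    }

    // Hypothetical wrapper: sorts by partition, then scans once.
    public static List<(int RowNumber, int RowType, int RunningValue)> Compute(
        IEnumerable<(int RowNumber, int RowType, int RowValue)> rows)
    {
        return rows
            .OrderBy(x => x.RowType)
            .ThenBy(x => x.RowNumber)
            .Scan(
                first => (RowNumber: first.RowNumber, RowType: first.RowType, RunningValue: first.RowValue),
                (res, cur) => (
                    RowNumber: cur.RowNumber,
                    RowType: cur.RowType,
                    // Restart the total whenever the partition changes.
                    RunningValue: res.RowType == cur.RowType
                        ? res.RunningValue + cur.RowValue
                        : cur.RowValue))
            .ToList();
    }
}
```

With the six test rows from the question's edit, the result is ordered by RowType and then RowNumber: rows 1, 3, 5 get running values 1, 2, 3, and rows 2, 4, 6 get 5, 10, 15.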

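Answer 1 (score: 0)

My eyes are crossed from staring at this. After six hours of banging my head against it, the answer to my long-winded question turns out to be as simple as this. Thanks to @NetMage for pointing out the SelectMany I was missing.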
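For illustration only, and not necessarily the code this answer originally contained, one way SelectMany can fit into a partitioned running sum is to group by RowType, keep a running total inside each group, and flatten the groups again. The class name GroupedRunningSum and the tuple shape are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; rows are (RowNumber, RowType, RowValue) tuples.
public static class GroupedRunningSum
{
    public static List<(int RowNumber, int RowType, int RunningValue)> Compute(
        IEnumerable<(int RowNumber, int RowType, int RowValue)> rows)
    {
        return rows
            .GroupBy(r => r.RowType)
            .SelectMany(group =>
            {
                // A fresh total per RowType group; this lambda runs once per group.
                var total = 0;
                return group
                    .OrderBy(r => r.RowNumber)
                    .Select(r => (
                        RowNumber: r.RowNumber,
                        RowType: r.RowType,
                        RunningValue: total += r.RowValue));
            })
            // Flattening loses the original interleaving, so restore it.
            .OrderBy(r => r.RowNumber)
            .ToList();
    }
}
```

The Select lambda mutates a captured variable, so this relies on each group being enumerated exactly once, which the ToList at the end takes care of.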