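I'm trying to figure out the best way to compute a partitioned running sum in LINQ using a self-joined collection.
The following query is a simple example of what I'm after. The output is RowNumber, RowType, and the running sum of RowValue over the current row and all prior rows with the same RowType.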
DECLARE @T TABLE (RowNumber INT, RowType INT, RowValue INT)
INSERT @T VALUES (1,1,1),(2,1,1),(3,1,1),(4,1,1),(5,1,1),(6,2,1),(7,2,1),(8,2,1),(9,2,1),(10,2,1)
;WITH Data AS(SELECT RowNumber, RowType,RowValue FROM @T)
SELECT
This.RowNumber,
This.RowType,
RunningValue = COALESCE(This.RowValue + SUM(Prior.RowValue),This.RowValue)
FROM
Data This
LEFT OUTER JOIN Data Prior ON Prior.RowNumber < This.RowNumber AND Prior.RowType = This.RowType
GROUP BY
This.RowNumber,
This.RowType,
This.RowValue
/* OR
SELECT
This.RowNumber,
This.RowType,
RunningValue = SUM(RowValue) OVER (PARTITION BY RowType ORDER BY RowNumber)
FROM
Data This
*/
So far, I haven't been able to get this working in LINQ:
var joinedWithPreviousSums = allRows.Join(
allRows,
previousRows => new {previousRows.RowNumber, previousRows.RowType, previousRows.RowValue},
row=> new { row.RowNumber, row.RowType, row.RowValue},
(previousRows, row) => new { row.RowNumber, row.RowType, row.RowValue })
.Where(previousRows.RowType == row.RowType && previousRows.RowNumber < row.RowNumber)
.Select(row.RowNumber, row.RowType,RunningValue = Sum(previousRows.Value) + row.RowValue)).ToList()
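Of course, the last two lines above are garbage. They are only meant to illustrate the projection I want, and they hint at my lack of understanding of performant LINQ projections.
I have read that some variation of the statement below could work, but is there a way to achieve a similar result without yielding?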
int s = 0;
var subgroup = people.OrderBy(x => x.Amount)
.TakeWhile(x => (s += x.Amount) < 1000)
.ToList();
Edit: I have been able to get the snippet below working, but I can't seem to partition or project on RowType.
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApplication1
{
class Program
{
delegate string CreateGroupingDelegate(int i);
static void Main(string[] args)
{
List <TestClass> list = new List<TestClass>()
{
new TestClass(1, 1, 1),
new TestClass(2, 2, 5),
new TestClass(3, 1, 1 ),
new TestClass(4, 2, 5),
new TestClass(5, 1, 1),
new TestClass(6, 2, 5)
};
int running_total = 0;
var result_set = list.Select(x => new { x.RowNumber, x.RowType, running_total = (running_total = running_total + x.RowValue) }).ToList();
foreach (var v in result_set)
{
Console.WriteLine("list element: {0}, total so far: {1}",
v.RowNumber,
v.running_total);
}
Console.ReadLine();
}
}
public class TestClass
{
public TestClass(int rowNumber, int rowType, int rowValue)
{
RowNumber = rowNumber;
RowType = rowType;
RowValue = rowValue;
}
public int RowNumber { get; set; }
public int RowType { get; set; }
public int RowValue { get; set; }
}
}
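Answer 0 (score: 2)
Your answer can be simplified considerably, but even then it scales poorly: it has to run the Where over every row in order to compute each row, so it is O(list.Count^2).
Here is the simpler version, which preserves the original order: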
var result = list.Select(item => new {
RowType = item.RowType,
RowValue = list.Where(prior => prior.RowNumber <= item.RowNumber && prior.RowType == item.RowType).Sum(prior => prior.RowValue)
});
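If you are willing to sort, you can get by with a single pass through list. (If you know the order is already correct, or a simpler sort will do, you can remove or replace the OrderBy / ThenBy.)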
var ans = list.OrderBy(x => x.RowType)
.ThenBy(x => x.RowNumber)
.Scan(first => new { first.RowType, first.RowValue },
(res, cur) => res.RowType == cur.RowType ? new { res.RowType, RowValue = res.RowValue + cur.RowValue }
: new { cur.RowType, cur.RowValue }
);
This answer uses an extension method that is like Aggregate but, modeled on the APL scan operator, returns the intermediate results:
// TRes seedFn(T FirstValue)
// TRes combineFn(TRes PrevResult, T CurValue)
public static IEnumerable<TRes> Scan<T, TRes>(this IEnumerable<T> src, Func<T, TRes> seedFn, Func<TRes, T, TRes> combineFn) {
using (var srce = src.GetEnumerator()) {
if (srce.MoveNext()) {
var prev = seedFn(srce.Current);
while (srce.MoveNext()) {
yield return prev;
prev = combineFn(prev, srce.Current);
}
yield return prev;
}
}
}
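To see the pieces above working together, here is a minimal self-contained sketch. The Row, ScanExtensions and ScanDemo names are assumptions made for illustration (the question uses TestClass), and named tuples replace the anonymous types so the result can be returned from a method. The Scan method is the one shown above, and the data is the six-row list from the question's edit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's TestClass.
public class Row
{
    public Row(int rowNumber, int rowType, int rowValue)
    {
        RowNumber = rowNumber;
        RowType = rowType;
        RowValue = rowValue;
    }
    public int RowNumber { get; }
    public int RowType { get; }
    public int RowValue { get; }
}

public static class ScanExtensions
{
    // Like Aggregate, but yields every intermediate result.
    public static IEnumerable<TRes> Scan<T, TRes>(this IEnumerable<T> src, Func<T, TRes> seedFn, Func<TRes, T, TRes> combineFn)
    {
        using (var srce = src.GetEnumerator())
        {
            if (srce.MoveNext())
            {
                var prev = seedFn(srce.Current);
                while (srce.MoveNext())
                {
                    yield return prev;
                    prev = combineFn(prev, srce.Current);
                }
                yield return prev;
            }
        }
    }
}

public static class ScanDemo
{
    // Sorts by (RowType, RowNumber), then restarts the sum whenever RowType changes.
    public static List<(int RowNumber, int RowType, int RunningValue)> RunningTotals(IEnumerable<Row> rows)
    {
        return rows.OrderBy(r => r.RowType)
                   .ThenBy(r => r.RowNumber)
                   .Scan(first => (RowNumber: first.RowNumber, RowType: first.RowType, RunningValue: first.RowValue),
                         (res, cur) => res.RowType == cur.RowType
                             ? (RowNumber: cur.RowNumber, RowType: cur.RowType, RunningValue: res.RunningValue + cur.RowValue)
                             : (RowNumber: cur.RowNumber, RowType: cur.RowType, RunningValue: cur.RowValue))
                   .ToList();
    }

    public static void Main()
    {
        var list = new List<Row>
        {
            new Row(1, 1, 1), new Row(2, 2, 5), new Row(3, 1, 1),
            new Row(4, 2, 5), new Row(5, 1, 1), new Row(6, 2, 5)
        };
        foreach (var r in RunningTotals(list))
        {
            Console.WriteLine("{0} {1} {2}", r.RowNumber, r.RowType, r.RunningValue);
        }
        // Rows 1, 3, 5 (RowType 1) get 1, 2, 3; rows 2, 4, 6 (RowType 2) get 5, 10, 15.
    }
}
```

Note that the output comes back in (RowType, RowNumber) order rather than the original order. If the original order matters, a final OrderBy on RowNumber would restore it at the cost of another sort.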
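Answer 1 (score: 0)
My eyes have glazed over from staring at this. After 6 hours of banging my head against it, the answer to my long-winded question turns out to be this simple. Thanks to @NetMage for pointing out the SelectMany I was missing.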
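The snippet this answer refers to is not reproduced here. As a rough sketch of how SelectMany can fit into a partitioned running total (an assumption about the general shape, not the poster's actual code), one option is to group by RowType, keep one accumulator per group, and flatten the groups back into a single sequence. The Row class and the PartitionedSum.Compute name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's TestClass.
public class Row
{
    public Row(int rowNumber, int rowType, int rowValue)
    {
        RowNumber = rowNumber;
        RowType = rowType;
        RowValue = rowValue;
    }
    public int RowNumber { get; }
    public int RowType { get; }
    public int RowValue { get; }
}

public static class PartitionedSum
{
    public static List<(int RowNumber, int RowType, int RunningValue)> Compute(IEnumerable<Row> rows)
    {
        return rows
            .GroupBy(r => r.RowType)              // one partition per RowType
            .SelectMany(g =>
            {
                int running = 0;                  // fresh accumulator for each partition
                return g.OrderBy(r => r.RowNumber)
                        .Select(r => (RowNumber: r.RowNumber,
                                      RowType: r.RowType,
                                      RunningValue: running += r.RowValue));
            })
            .OrderBy(x => x.RowNumber)            // restore the original row order
            .ToList();
    }

    public static void Main()
    {
        var list = new List<Row>
        {
            new Row(1, 1, 1), new Row(2, 2, 5), new Row(3, 1, 1),
            new Row(4, 2, 5), new Row(5, 1, 1), new Row(6, 2, 5)
        };
        foreach (var r in Compute(list))
        {
            Console.WriteLine("{0} {1} {2}", r.RowNumber, r.RowType, r.RunningValue);
        }
        // Rows 1..6 get running values 1, 5, 2, 10, 3, 15.
    }
}
```

The inner Select mutates a captured variable, so it relies on each group's sequence being enumerated exactly once, which SelectMany does here. Each row is visited once per pass, so the work is dominated by the sorts rather than by a Where rescan for every row.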