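I'm trying to select the top n rowid values from the table variable below that will get me as close as possible to a sum(itemcount) of 200,000 without going over that threshold. If I were working this out by hand, I would just take the top 3 rows. I don't want to use a cursor unless there is no pure set-based way.
What is a good set-based way to get all the rowid values, summing until I reach a running total of 200,000?
I looked at "running totals" at http://www.1keydata.com/sql/sql-running-totals.html, but that doesn't seem like it would be efficient, because the real table has about 500k rows.
Here is what I've tried so far: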
declare @agestuff table ( rowid int primary key , itemcount int , itemage datetime )
insert into @agestuff values ( 1 , 175000 , '2013-01-24 17:21:40' )
insert into @agestuff values ( 2 , 300 , '2013-01-24 17:22:11' )
insert into @agestuff values ( 3 , 10000 , '2013-01-24 17:22:11' )
insert into @agestuff values ( 4 , 19000 , '2013-01-24 17:22:19' )
insert into @agestuff values ( 5 , 16000 , '2013-01-24 17:22:22' )
insert into @agestuff values ( 6 , 400 , '2013-01-24 17:23:06' )
insert into @agestuff values ( 7 , 25000 , '2013-01-24 17:23:06' )
select sum(itemcount) from @agestuff -- 245700 which is too many
select sum(itemcount) from @agestuff
where rowid in (1,2,3) -- 185300 which gets me as close as possible
Using SQL Server 2008. I will switch to 2012 if needed.
Answer 0 (score: 11)
DECLARE @point INT = 200000;
;WITH x(rowid, ic, r, s) AS
(
SELECT
rowid, itemcount, ROW_NUMBER() OVER (ORDER BY itemage, rowid),
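-- (itemage, rowid) is unique, so ROWS returns the same totals as RANGE while avoiding RANGE's on-disk spool
SUM(itemcount) OVER (ORDER BY [itemage], rowid ROWS UNBOUNDED PRECEDING)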
FROM @agestuff
)
SELECT x.rowid, x.ic, x.s
FROM x WHERE x.s <= @point
ORDER BY x.rowid;
Results:
rowid  ic      s
-----  ------  ------
1      175000  175000
2      300     175300
3      10000   185300
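The rowid tie-breaker in the ORDER BY matters here because the sample data contains rows with identical itemage values (rows 2 and 3, and rows 6 and 7). As a rough illustration, assuming SQL Server 2012 or later, the sketch below puts the two framings side by side. The column aliases range_total and rows_total are made up for this example, and the table is re-declared so the batch runs on its own.

```sql
-- Re-create the sample data from the question so this batch is self-contained.
DECLARE @agestuff TABLE (rowid INT PRIMARY KEY, itemcount INT, itemage DATETIME);

INSERT INTO @agestuff (rowid, itemcount, itemage) VALUES
  (1, 175000, '2013-01-24 17:21:40'),
  (2,    300, '2013-01-24 17:22:11'),
  (3,  10000, '2013-01-24 17:22:11'),
  (4,  19000, '2013-01-24 17:22:19'),
  (5,  16000, '2013-01-24 17:22:22'),
  (6,    400, '2013-01-24 17:23:06'),
  (7,  25000, '2013-01-24 17:23:06');

SELECT
  rowid, itemcount, itemage,
  -- RANGE with a non-unique ORDER BY: tied rows are peers and share one total.
  SUM(itemcount) OVER (ORDER BY itemage RANGE UNBOUNDED PRECEDING) AS range_total,
  -- Unique ORDER BY plus ROWS: every row gets its own running total.
  SUM(itemcount) OVER (ORDER BY itemage, rowid ROWS UNBOUNDED PRECEDING) AS rows_total
FROM @agestuff
ORDER BY itemage, rowid;
```

With this data, rows 2 and 3 should both show 185300 in range_total, while rows_total steps through 175300 and then 185300, which is the per-row value the filter against @point needs.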
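If for some reason you can't use SQL Server 2012, then on SQL Server 2008 you can use a couple of alternatives:
Quirky update:
Please note that this behavior is undocumented, and it is not guaranteed to compute the running totals in the correct order. So use it at your own risk.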
DECLARE @st TABLE
(
rowid INT PRIMARY KEY,
itemcount INT,
s INT
);
DECLARE @RunningTotal INT = 0;
INSERT @st(rowid, itemcount, s)
SELECT rowid, itemcount, 0
FROM @agestuff
ORDER BY rowid;
UPDATE @st
SET @RunningTotal = s = @RunningTotal + itemcount
FROM @st;
SELECT rowid, itemcount, s
FROM @st
WHERE s <= @point -- inclusive, consistent with the other two approaches
ORDER BY rowid;
Cursor:
DECLARE @st TABLE
(
rowid INT PRIMARY KEY, itemcount INT, s INT
);
DECLARE
@rowid INT, @itemcount INT, @RunningTotal INT = 0;
DECLARE c CURSOR LOCAL FAST_FORWARD
FOR SELECT rowid, itemcount
FROM @agestuff ORDER BY rowid;
OPEN c;
FETCH c INTO @rowid, @itemcount;
WHILE @@FETCH_STATUS = 0
BEGIN
SET @RunningTotal = @RunningTotal + @itemcount;
IF @RunningTotal > @point
BREAK;
INSERT @st(rowid, itemcount, s)
SELECT @rowid, @itemcount, @RunningTotal;
FETCH c INTO @rowid, @itemcount;
END
CLOSE c;
DEALLOCATE c;
SELECT rowid, itemcount, s
FROM @st
ORDER BY rowid;
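I've shown only two alternatives because the other options are even less desirable (mainly from a performance standpoint). You can see them in the following blog post, along with how they perform and more information about potential gotchas. Don't paint yourself into a corner by insisting that cursors are bad - sometimes, as in this case, they can be the most efficient, supported, and reliable option: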
http://www.sqlperformance.com/2012/07/t-sql-queries/running-totals
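For comparison, here is a minimal sketch of what one purely set-based alternative might look like on SQL Server 2008: a correlated subquery (written with CROSS APPLY) that re-sums all earlier rows for every row. This is an illustration, not code taken from the answer or the blog post, and the alias rt is arbitrary. It orders by rowid, like the two 2008 options above, and re-declares the sample table so the batch runs on its own.

```sql
-- Re-create the sample data from the question so this batch is self-contained.
DECLARE @agestuff TABLE (rowid INT PRIMARY KEY, itemcount INT, itemage DATETIME);

INSERT INTO @agestuff (rowid, itemcount, itemage) VALUES
  (1, 175000, '2013-01-24 17:21:40'),
  (2,    300, '2013-01-24 17:22:11'),
  (3,  10000, '2013-01-24 17:22:11'),
  (4,  19000, '2013-01-24 17:22:19'),
  (5,  16000, '2013-01-24 17:22:22'),
  (6,    400, '2013-01-24 17:23:06'),
  (7,  25000, '2013-01-24 17:23:06');

DECLARE @point INT = 200000;

SELECT a.rowid, a.itemcount, rt.s
FROM @agestuff AS a
CROSS APPLY
(
  -- For each outer row, add up itemcount over that row and every earlier rowid.
  SELECT SUM(b.itemcount) AS s
  FROM @agestuff AS b
  WHERE b.rowid <= a.rowid
) AS rt
WHERE rt.s <= @point
ORDER BY a.rowid;
```

On the seven sample rows this returns rowid 1, 2 and 3, but the inner SUM re-reads every earlier row for each outer row, so the work grows roughly quadratically. That kind of cost is what makes it a poor fit for a table of about 500k rows.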