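I looked through several other questions trying to find an answer, but couldn't. Here is the situation: I have a really big table that will grow indefinitely. When I say BIG, I mean that a query over 6 hours of data touches about 10 million rows. We have several months of data, so you can see how large it is.
With the size concern established, I want to run a very simple query: group by one column and total up the occurrences per group. From that I want the 10 largest totals, plus a single total for everything that is not in the top 10. I know there are ways to do this, but I want to do it without computing the totals twice. For that I used table variables. I am using SQL Server 2012.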
DECLARE @sumsTable TABLE(operationName varchar(200), operationAmount int)
DECLARE @topTable TABLE(operationName varchar(200), operationAmount int)
DECLARE @startTime DATETIME
DECLARE @endTime DATETIME
DECLARE @top INTEGER
SET @top = 10
SET @endTime = '03/11/2013'
SET @startTime = '03/10/2013'
--grouping by operationName and counting occurrences
INSERT INTO @sumsTable
SELECT operationName, COUNT(*) AS operationAmount
FROM [f6f87bf0-33ab-4882-8674-2cb31e5e49c4]
WHERE (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime)
GROUP BY operationName
--selecting top occurrences
INSERT INTO @topTable
SELECT TOP(@top) * FROM @sumsTable
ORDER BY operationAmount DESC
--Summing others and making union with top
SELECT 'OTHER' AS operationName, SUM(operationAmount) as operationAmount FROM @sumsTable
WHERE operationName NOT IN (SELECT operationName FROM @topTable)
UNION
SELECT * FROM @topTable
ORDER BY operationAmount DESC
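My question is whether this is a good approach, and whether there is a better, faster way. What am I doing wrong? Can I get rid of the table variables without computing the totals more than once?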
Answer 0: (score: 2)
You can do this without temporary tables:
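-- declarations carried over from the question so the snippet runs on its own
DECLARE @startTime DATETIME
DECLARE @endTime DATETIME
DECLARE @top INTEGER
SET @top = 10
SET @endTime = '03/11/2013'
SET @startTime = '03/10/2013'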
select
(case when y.RowID > @top then 'OTHER' else y.operationName end) as operationName,
sum(y.operationAmount) as operationAmount
from
(
select
row_number() over(order by count(*) desc) as RowID,
x.operationName,
count(*) AS operationAmount
from [f6f87bf0-33ab-4882-8674-2cb31e5e49c4] as x
where (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime)
group by x.operationName
)
as y
group by (case when y.RowID > @top then 'OTHER' else y.operationName end)
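To try the pattern above without touching the real table, here is a minimal sketch on made-up data. The table variable `@ops`, its rows, and `@top = 2` are assumptions chosen only for illustration; the time filter is left out to keep it short.

```sql
-- Hypothetical sample data: @ops stands in for the real table.
DECLARE @ops TABLE (operationName varchar(200));
INSERT INTO @ops (operationName) VALUES
    ('read'), ('read'), ('read'),
    ('write'), ('write'),
    ('delete'),
    ('list');

-- Keep only the top 2 here so the OTHER bucket is visible.
DECLARE @top int = 2;

SELECT
    CASE WHEN y.RowID > @top THEN 'OTHER' ELSE y.operationName END AS operationName,
    SUM(y.operationAmount) AS operationAmount
FROM (
    -- one pass over the data: count per operation and rank the counts
    SELECT
        ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) AS RowID,
        x.operationName,
        COUNT(*) AS operationAmount
    FROM @ops AS x
    GROUP BY x.operationName
) AS y
GROUP BY CASE WHEN y.RowID > @top THEN 'OTHER' ELSE y.operationName END;
```

With this data the result should contain `read` (3), `write` (2) and `OTHER` (2, the sum of `delete` and `list`). Row order is not guaranteed because there is no ORDER BY.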
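Answer 1: (score: 0)
With the following SQL, you only need to aggregate the original table once,
instead of
row_number() over(order by count(*) desc) as RowID, x.operationName, count(*) AS operationAmount
which appears to compute count(*) twice.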
DECLARE @startTime DATETIME
DECLARE @endTime DATETIME
DECLARE @top INTEGER
SET @top = 10
SET @endTime = '03/11/2013'
SET @startTime = '03/10/2013'
;WITH cte AS ( -- get the total for each operation
SELECT operationName, COUNT(*) AS operationAmount
FROM [f6f87bf0-33ab-4882-8674-2cb31e5e49c4]
WHERE (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime)
GROUP BY operationName
),
cte1 AS ( -- rank totals
SELECT operationName, operationAmount, ROW_NUMBER() OVER (ORDER BY operationAmount DESC) AS RN
FROM cte
) -- keep the top @top rows (RN <= @top, not RN < 10) and fold the rest into 'Others'
SELECT (CASE WHEN RN <= @top THEN operationName ELSE 'Others' END) AS Name, SUM(operationAmount) AS operationAmount
FROM cte1
GROUP BY (CASE WHEN RN <= @top THEN operationName ELSE 'Others' END)
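The query in the question sorted its output by operationAmount descending, while the CTE version returns rows in no particular order. One possible way to get the top rows in rank order with the catch-all bucket last is to sort by the smallest rank in each group. This is only a sketch: the table variable `@sample`, its rows, and `@top = 2` are hypothetical and exist just to make it runnable.

```sql
-- Hypothetical sample data standing in for the real table.
DECLARE @sample TABLE (operationName varchar(200));
INSERT INTO @sample (operationName) VALUES
    ('read'), ('read'), ('read'),
    ('write'), ('write'),
    ('delete'),
    ('list');

DECLARE @top int = 2;

;WITH totals AS (   -- aggregate once
    SELECT operationName, COUNT(*) AS operationAmount
    FROM @sample
    GROUP BY operationName
),
ranked AS (         -- rank the totals
    SELECT operationName, operationAmount,
           ROW_NUMBER() OVER (ORDER BY operationAmount DESC) AS RN
    FROM totals
)
SELECT
    CASE WHEN RN <= @top THEN operationName ELSE 'Others' END AS Name,
    SUM(operationAmount) AS operationAmount
FROM ranked
GROUP BY CASE WHEN RN <= @top THEN operationName ELSE 'Others' END
-- each top row is its own group, so MIN(RN) is its rank;
-- the 'Others' group only holds ranks above @top, so it sorts last
ORDER BY MIN(RN);
```

On this data the rows should come back as `read` (3), `write` (2), then `Others` (2). Sorting by `MIN(RN)` rather than by the total keeps `Others` at the bottom even when its combined total exceeds some of the top rows.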