此方案基于another question中的架构,我对任何有关架构有效性的讨论都不感兴趣!
我很想知道SQL Server中是否有任何好的技术可以根据另一列(amount1
)的不同值执行一列(下面id1
)的聚合。 / p>
下面的Plan1扫描table1两次,通过p_id执行两次聚合,然后将结果连接在一起。似乎可以改进。在某些情况下,查询2可能会返回错误的结果,无论如何计划都会更糟!
有什么想法吗?
IF OBJECT_ID('tempdb..#table1') IS NOT NULL DROP TABLE #table1;
IF OBJECT_ID('tempdb..#table2') IS NOT NULL DROP TABLE #table2;
CREATE TABLE #table1 (id1 int primary key nonclustered, amount1 int, p_id int);
CREATE CLUSTERED INDEX ix ON #table1 (p_id,id1);
INSERT INTO #table1
SELECT 1,500,10 UNION ALL
SELECT 2,700,20 UNION ALL
SELECT 3,500,10 UNION ALL
SELECT 4,450,20 UNION ALL
SELECT 5,300,10;
CREATE TABLE #table2 (id2 int primary key, amount2 int, id1 int);
INSERT INTO #table2
SELECT 1,300,1 UNION ALL
SELECT 2,200,1 UNION ALL
SELECT 3,200,2 UNION ALL
SELECT 4,500,2 UNION ALL
SELECT 5,400,3 UNION ALL
SELECT 6,150,4 UNION ALL
SELECT 7,300,4 UNION ALL
SELECT 8,300,5;
WITH t1
AS (SELECT p_id,SUM(amount1) AS total1
FROM #table1
GROUP BY p_id),
t2
AS (SELECT p_id,SUM(amount2) AS total2
FROM #table2 table2
JOIN #table1 table1
ON table1.id1 = table2.id1
GROUP BY p_id)
SELECT t1.p_id,total1,total2
FROM t1
JOIN t2
ON t1.p_id = t2.p_id
SELECT table1.p_id,
FLOOR(SUM(DISTINCT amount1 + table1.id1/100000000.0)) AS total1,
SUM(amount2) AS total2
FROM #table1 table1 JOIN #table2 table2 ON table1.id1=table2.id1
GROUP BY table1.p_id
答案 0 :(得分:2)
这个只扫描一个表中的每个记录一次:
SELECT p_id, SUM(amount1) AS total1, SUM(s_amount2) AS total2
FROM #table1 t1
CROSS APPLY
(
SELECT SUM(amount2) AS s_amount2
FROM #table2 t2
WHERE t2.id1 = t1.id1
) t2
GROUP BY
p_id
|--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [Expr1026]=(0) THEN NULL ELSE [Expr1027] END, [Expr1007]=CASE WHEN [Expr1028]=(0) THEN NULL ELSE [Expr1029] END))
|--Stream Aggregate(GROUP BY:([t1].[p_id]) DEFINE:([Expr1026]=COUNT_BIG([tempdb].[dbo].[#table1].[amount1] as [t1].[amount1]), [Expr1027]=SUM([tempdb].[dbo].[#table1].[amount1] as [t1].[amount1]), [Expr1028]=COUNT_BIG([Expr1005]), [Expr1029]=SUM([Expr1005])))
|--Nested Loops(Left Outer Join, OUTER REFERENCES:([t1].[id1]))
|--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#table1] AS [t1]), ORDERED FORWARD)
|--Compute Scalar(DEFINE:([Expr1005]=CASE WHEN [Expr1024]=(0) THEN NULL ELSE [Expr1025] END))
|--Stream Aggregate(DEFINE:([Expr1024]=COUNT_BIG([tempdb].[dbo].[#table2].[amount2] as [t2].[amount2]), [Expr1025]=SUM([tempdb].[dbo].[#table2].[amount2] as [t2].[amount2])))
|--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#table2] AS [t2]), WHERE:([tempdb].[dbo].[#table2].[id1] as [t2].[id1]=[tempdb].[dbo].[#table1].[id1] as [t1].[id1]))
,虽然它不一定更有效率。
这一个:
SELECT p_id, SUM(amount1) AS total1, SUM(s_amount2) AS total2
FROM #table1 t1
JOIN (
SELECT id1, SUM(amount2) AS s_amount2
FROM #table2
GROUP BY
id1
) t2
ON t2.id1 = t1.id1
GROUP BY
p_id
将为联接提供更多选项,但是,如果将选择t2
,则可在计划中使用额外的假脱机。
答案 1 :(得分:2)
嗯,@ Quassnoi解决方案似乎相当不错。在任何情况下,对于SQL Server 2005+,您可以使用PARTITION BY
子句尝试进行更简单的查询,但执行计划并不是更好,尽管它并不一定意味着效率更高或更低。 / p>
SELECT A.p_id, MIN(amount1) total1, SUM(amount2) total2
FROM (SELECT p_id, id1, SUM(amount1) OVER(PARTITION BY p_id) amount1 FROM #table1) A
JOIN #table2 B
ON A.id1 = B.id1
GROUP BY A.p_id