Using ORDER BY in the OVER clause

Time: 2019-05-06 01:04:03

Tags: sql tsql window-functions

I am new to T-SQL and window functions.

I don't understand why the following two queries produce the same result:

SELECT 
    empid, ordermonth, val,
    SUM(val) OVER (PARTITION BY empid ORDER BY ordermonth
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runval
FROM 
    Sales.EmpOrders;

SELECT 
    empid, ordermonth, val,
    SUM(val) OVER(PARTITION BY empid ORDER BY ordermonth) AS runval
FROM 
    Sales.EmpOrders;

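Both queries return the same output.

Shouldn't the second query produce the same total on every row of each empid? Or is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW the default, and therefore optional, when ORDER BY is used in the OVER clause?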

2 Answers:

Answer 0: (score: 1)

If you want every row of an empid to carry the same value, don't use ORDER BY:

SELECT empid, ordermonth, val,
       SUM(val) OVER (PARTITION BY empid) AS runval
FROM Sales.EmpOrders;

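Otherwise, your two expressions are equivalent, provided the sort key is unique within each partition. The default is described in the documentation:

"If ROWS/RANGE is not specified but ORDER BY is specified, RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as default for window frame."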
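To make the default visible, here is a sketch of the second query from the question with the frame written out explicitly. It assumes the same Sales.EmpOrders table as the question and should behave exactly like the version without a frame clause:

```sql
-- The question's second query with the implicit default frame spelled out.
-- RANGE (not ROWS) is what SQL Server applies when only ORDER BY is given.
SELECT
    empid, ordermonth, val,
    SUM(val) OVER (PARTITION BY empid ORDER BY ordermonth
                   RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runval
FROM
    Sales.EmpOrders;
```

The only difference from the first query is RANGE in place of ROWS, which matters only when two rows of the same empid share an ordermonth.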

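Answer 1: (score: 1)

For a running sum (or a similar aggregate), the difference becomes visible when two rows tie on the ORDER BY columns. Consider the following example, in which the employee has two orders on 2006-09-01: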

DECLARE @T TABLE (empid INT, ordermonth DATE, val INT);
INSERT INTO @T VALUES
(1, '2006-07-01', 100),
(1, '2006-08-01', 100),
(1, '2006-09-01', 100),
(1, '2006-09-01', 100),
(1, '2006-10-01', 100);

SELECT empid, ordermonth, val,
   runval_rows = SUM(val) OVER (PARTITION BY empid ORDER BY ordermonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
   runval_auto = SUM(val) OVER (PARTITION BY empid ORDER BY ordermonth)
FROM @T;

empid | ordermonth | val | runval_rows | runval_auto
1     | 2006-07-01 | 100 | 100         | 100
1     | 2006-08-01 | 100 | 200         | 200
1     | 2006-09-01 | 100 | 300*        | 400*
1     | 2006-09-01 | 100 | 400*        | 400*
1     | 2006-10-01 | 100 | 500         | 500

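When no ROWS/RANGE clause is specified, SQL Server defaults to:

"If ROWS/RANGE is not specified but ORDER BY is specified, RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as default for window frame."

In the simplest terms, a range is the set of rows within the partition that have the same value in the columns named in the ORDER BY clause. With RANGE, CURRENT ROW therefore covers the current row and all rows tied with it, so the second variant treats rows 3 and 4 as part of the same range and includes both of them when computing the running sum.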
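As a rough illustration of the same point, the sketch below reuses the sample data with one addition: a hypothetical orderid identity column, which is not part of the original example and is assumed here only to provide a unique tie-breaker. The first column writes the default RANGE frame explicitly; the second keeps the default frame but makes the sort key unique.

```sql
-- orderid is a hypothetical column added for illustration only.
DECLARE @T TABLE (orderid INT IDENTITY(1,1), empid INT, ordermonth DATE, val INT);
INSERT INTO @T (empid, ordermonth, val) VALUES
(1, '2006-07-01', 100),
(1, '2006-08-01', 100),
(1, '2006-09-01', 100),
(1, '2006-09-01', 100),
(1, '2006-10-01', 100);

SELECT empid, ordermonth, val,
    -- Explicit RANGE frame: tied rows are summed together (100, 200, 400, 400, 500).
    runval_range = SUM(val) OVER (PARTITION BY empid ORDER BY ordermonth
                                  RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
    -- Default frame with a unique sort key: no ties remain (100, 200, 300, 400, 500).
    runval_tiebreak = SUM(val) OVER (PARTITION BY empid ORDER BY ordermonth, orderid)
FROM @T
ORDER BY empid, ordermonth, orderid;
```

With a unique sort key every range contains exactly one row, so the default RANGE frame and an explicit ROWS frame give the same running total.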