LAG and LEAD functions

Time: 2012-10-18 11:09:15

Tags: sql-server-2012

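What are the benefits of using the new LAG and LEAD functions in SQL Server 2012?
Do they only make queries simpler to write and easier to debug, or is there also a performance improvement?

This matters to me because we need this type of functionality very often, and I need to know whether I should recommend an upgrade in the near future.
If it only means simpler queries, the hassle (and cost) of upgrading will not be worth it.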

2 answers:

Answer 0: (score: 4)

To demonstrate the difference in execution plans, I used the winning solution from Dave's SQL Authority blog:

;WITH T1
AS (SELECT row_number() OVER (ORDER BY SalesOrderDetailID) N
         , s.SalesOrderID
         , s.SalesOrderDetailID
    FROM
        TempDB.dbo.LAG s
    WHERE
        SalesOrderID IN (20120303, 20120515, 20120824, 20121031))
SELECT SalesOrderID
     , SalesOrderDetailID AS CurrentSalesOrderDetailID
/*   , CASE
           WHEN N % 2 = 1 THEN
               max(CASE
                   WHEN N % 2 = 0 THEN
                       SalesOrderDetailID
               END) OVER (PARTITION BY (N + 1) / 2)
           ELSE
               max(CASE
                   WHEN N % 2 = 1 THEN
                       SalesOrderDetailID
               END) OVER (PARTITION BY N / 2)
       END LeadVal */
     , CASE
           WHEN N % 2 = 1 THEN
               max(CASE
                   WHEN N % 2 = 0 THEN
                       SalesOrderDetailID
               END) OVER (PARTITION BY N / 2)
           ELSE
               max(CASE
                   WHEN N % 2 = 1 THEN
                       SalesOrderDetailID
               END) OVER (PARTITION BY (N + 1) / 2)
       END PreviousSalesOrderDetailID
FROM
    T1
ORDER BY
    SalesOrderID
  , SalesOrderDetailID;



SELECT SalesOrderID
     , SalesOrderDetailID AS CurrentSalesOrderDetailID
     , LAG(SalesOrderDetailID, 1, 0) OVER (ORDER BY SalesOrderID, SalesOrderDetailID) AS PreviousSalesOrderDetailID
FROM TempDB.dbo.LAG
WHERE SalesOrderID  IN (20120303, 20120515, 20120824, 20121031);





Warning: Null value is eliminated by an aggregate or other SET operation.

(10204 row(s) affected)
Table 'Worktable'. Scan count 6, logical reads 81638, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LAG'. Scan count 4, logical reads 48, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:   CPU time = 297 ms,  elapsed time = 332 ms.

--- versus ---

(10204 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LAG'. Scan count 4, logical reads 48, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:   CPU time = 78 ms,  elapsed time = 113 ms.

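Besides being more elegant, the LAG version consumes fewer resources: the worktable reads drop from 81638 to 0, and CPU time drops from 297 ms to 78 ms.

Here is a comparison of the graphical execution plans: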
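The I/O and timing figures above have the shape of the messages SQL Server prints when `SET STATISTICS IO` and `SET STATISTICS TIME` are switched on. As a minimal sketch for reproducing this kind of measurement, assuming the same `TempDB.dbo.LAG` table used in the queries above, something like the following could be run for each variant:

```sql
-- Report per-table reads and CPU/elapsed time in the Messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Query under test: the built-in LAG version from above.
SELECT SalesOrderID
     , SalesOrderDetailID AS CurrentSalesOrderDetailID
     , LAG(SalesOrderDetailID, 1, 0)
           OVER (ORDER BY SalesOrderID, SalesOrderDetailID) AS PreviousSalesOrderDetailID
FROM TempDB.dbo.LAG
WHERE SalesOrderID IN (20120303, 20120515, 20120824, 20121031);

-- Switch the reporting off again for the rest of the session.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Timings from a single run vary with caching and server load, so the numbers are best compared across several runs.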

Execution Plans Show Clear Winner in This Specific Case

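The execution plans show a clear winner in this specific case. Dave's page has many different possible approaches to getting LEAD/LAG functionality. Perhaps some of them would outperform SQL Server's built-in solution. Or perhaps not.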
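The comparison above only exercises LAG; the LEAD half of the emulation is commented out in the first query. For reference, here is a sketch of the built-in counterpart against the same `TempDB.dbo.LAG` table. The column alias `NextSalesOrderDetailID` is made up for illustration, and this variant was not part of the measurements above:

```sql
-- LAG looks one row back, LEAD looks one row ahead, both in the same ordering.
-- The third argument (0) is the default returned when no such row exists.
SELECT SalesOrderID
     , SalesOrderDetailID AS CurrentSalesOrderDetailID
     , LAG(SalesOrderDetailID, 1, 0)
           OVER (ORDER BY SalesOrderID, SalesOrderDetailID) AS PreviousSalesOrderDetailID
     , LEAD(SalesOrderDetailID, 1, 0)
           OVER (ORDER BY SalesOrderID, SalesOrderDetailID) AS NextSalesOrderDetailID
FROM TempDB.dbo.LAG
WHERE SalesOrderID IN (20120303, 20120515, 20120824, 20121031);
```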

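Answer 1: (score: 2)

I can't comment much on MS SQL Server 2012, but from a PostgreSQL perspective, these functions have been available since version 8.4.

In general, they are very handy for detecting changes (usually in time series, combined with ORDER BY). A typical example: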

WITH shifted_timeseries AS (
    SELECT event_time,
           value,
           LAG(value) OVER (ORDER BY event_time) AS lagged_value
        FROM timeseries
)
SELECT event_time AS change_time, value AS new_value
FROM shifted_timeseries
    WHERE value != lagged_value;

For this kind of thing, they are worth it for clarity alone (although that may be subjective).
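One subtlety in the query above: `LAG` returns NULL for the first row, so `value != lagged_value` evaluates to NULL there and the first row is filtered out. If the first observation should also count as a change, one option in PostgreSQL is `IS DISTINCT FROM`, which treats NULL as an ordinary comparable value. A sketch against the same `timeseries` table:

```sql
WITH shifted_timeseries AS (
    SELECT event_time,
           value,
           LAG(value) OVER (ORDER BY event_time) AS lagged_value
        FROM timeseries
)
SELECT event_time AS change_time, value AS new_value
FROM shifted_timeseries
    -- NULL-safe comparison: true for the first row (lagged_value is NULL)
    -- whenever value itself is not NULL.
    WHERE value IS DISTINCT FROM lagged_value;
```

This is PostgreSQL syntax; SQL Server 2012 does not accept `IS DISTINCT FROM`, so an explicit `lagged_value IS NULL OR value <> lagged_value` condition would be needed there.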

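For more complex operations, for example if you want the time periods over which a value stays continuous, this answer is a very elegant solution to the problem. According to {{3}}, it appears to work well in SQL Server 2012.

These two blog entries also show comparisons between using LEAD/LAG and performing the same query without them:

(It would be interesting to compare the execution plans.)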
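To give a rough idea of what "periods of continuous values" means, here is a sketch in PostgreSQL syntax against the same `timeseries` table. It is not necessarily the approach taken in the linked answer; the names `flagged`, `grouped`, `is_change` and `period_id` are made up for illustration:

```sql
WITH flagged AS (
    -- Mark each row whose value differs from the previous row's value.
    SELECT event_time,
           value,
           CASE WHEN value IS DISTINCT FROM LAG(value) OVER (ORDER BY event_time)
                THEN 1 ELSE 0
           END AS is_change
        FROM timeseries
),
grouped AS (
    -- A running total of the change flags gives every unbroken run its own id.
    SELECT event_time,
           value,
           SUM(is_change) OVER (ORDER BY event_time) AS period_id
        FROM flagged
)
SELECT value,
       MIN(event_time) AS period_start,
       MAX(event_time) AS period_end
FROM grouped
GROUP BY period_id, value
ORDER BY period_start;
```

Each output row is one run of consecutive identical values, with the first and last timestamp at which that value was observed.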