Group-by-window query and the PARTITION BY clause

Time: 2018-01-18 15:47:51

Tags: sql sql-server sql-server-2012 gaps-and-islands partition-by

I have the following code:

declare @test table (id int, [Status] int, [Date] date)

insert into @test (Id,[Status],[Date]) VALUES
    (1,1,'2018-01-01'),
    (2,1,'2018-01-01'),
    (1,1,'2017-11-01'),
    (1,2,'2017-10-01'),
    (1,1,'2017-09-01'),
    (2,2,'2017-01-01'),
    (1,1,'2017-08-01'),
    (1,1,'2017-07-01'),
    (1,1,'2017-06-01'),
    (1,2,'2017-05-01'),
    (1,1,'2017-04-01'),
    (1,1,'2017-03-01'),
    (1,1,'2017-01-01')

SELECT
    id,
    [Status],
    MIN([Date]) OVER (PARTITION BY id, [Status] ORDER BY [Date], id, [Status]) AS WindowStart,
    MAX([Date]) OVER (PARTITION BY id, [Status] ORDER BY [Date], id, [Status]) AS WindowEnd,
    COUNT(*) OVER (PARTITION BY id, [Status] ORDER BY [Date], id, [Status]) AS total
FROM @test

But the result is as follows:

id  Status  WindowStart WindowEnd   total
1   1   2017-01-01  2017-01-01  1
1   1   2017-01-01  2017-03-01  2
1   1   2017-01-01  2017-04-01  3
1   1   2017-01-01  2017-06-01  4
1   1   2017-01-01  2017-07-01  5
1   1   2017-01-01  2017-08-01  6
1   1   2017-01-01  2017-09-01  7
1   1   2017-01-01  2017-11-01  8
1   1   2017-01-01  2018-01-01  9
1   2   2017-05-01  2017-05-01  1
1   2   2017-05-01  2017-10-01  2
2   1   2018-01-01  2018-01-01  1
2   2   2017-01-01  2017-01-01  1

I need the rows grouped by window, like this:

id  Status  WindowStart WindowEnd   total
1   1   2017-01-01  2017-04-01  3
1   2   2017-05-01  2017-05-01  1
1   1   2017-06-01  2017-09-01  4
1   2   2017-10-01  2017-10-01  1
1   1   2017-11-01  2018-01-01  2
2   1   2018-01-01  2018-01-01  1
2   2   2017-01-01  2017-01-01  1

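The first group for id = 1, Status = 1 should end at the first row with Status = 2 (2017-05-01), so its total is 3. The next group then runs from 2017-06-01 to 2017-09-01 and contains 4 rows.

How can this be done?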

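2 Answers:

Answer 0: (score: 0)

This is a "classic" gaps-and-islands problem. There are probably thousands of answers to it on the internet.

The following works for what you're after; however, please try to do a bit more research first. :)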

WITH Groups AS(
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY [Date]) - 
           ROW_NUMBER() OVER (PARTITION BY id, [status] ORDER BY [Date]) AS Grp
    FROM @test t)
SELECT G.id,
       G.[Status],
       MIN([Date]) AS WindowStart,
       MAX([Date]) AS WindowEnd,
       COUNT(*) AS Total
FROM Groups G
GROUP BY G.id,
         G.[Status],
         G.Grp
ORDER BY G.id, WindowStart;

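Note that the last two rows come out in the opposite order with this solution; your expected results appear to be ordered ascending for ID 1 and descending for ID 2.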
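
To see why the difference between the two ROW_NUMBER() values works as a group key, it can help to select the intermediate values. The following is only a sketch: @sample is a hypothetical table variable holding five of the question's rows for id = 1, and the aliases rn_all and rn_status are illustrative names that do not appear in the answer above.

```sql
DECLARE @sample TABLE (id int, [Status] int, [Date] date);

-- Hypothetical sample: five rows taken from the question's data (id = 1 only).
INSERT INTO @sample (id, [Status], [Date]) VALUES
    (1, 1, '2017-03-01'),
    (1, 1, '2017-04-01'),
    (1, 2, '2017-05-01'),
    (1, 1, '2017-06-01'),
    (1, 1, '2017-07-01');

SELECT s.id,
       s.[Status],
       s.[Date],
       -- Position of the row among all rows for this id.
       ROW_NUMBER() OVER (PARTITION BY s.id ORDER BY s.[Date]) AS rn_all,
       -- Position of the row among rows with the same id and status.
       ROW_NUMBER() OVER (PARTITION BY s.id, s.[Status] ORDER BY s.[Date]) AS rn_status,
       -- Difference of the two counters: constant within one consecutive run.
       ROW_NUMBER() OVER (PARTITION BY s.id ORDER BY s.[Date])
     - ROW_NUMBER() OVER (PARTITION BY s.id, s.[Status] ORDER BY s.[Date]) AS Grp
FROM @sample s
ORDER BY s.id, s.[Date];
```

Within a run of consecutive rows that share a status, both counters rise by one per row, so Grp stays constant. When a row with the other status interrupts the run, rn_all advances but rn_status for the first status does not, so the next run of that status gets a different Grp. For these five rows Grp comes out as 0, 0, 2, 1, 1. In general the same Grp value can occur under different statuses, which is presumably why the final GROUP BY includes [Status] as well as Grp.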

Answer 1: (score: 0)

Here is one way to do it, using the LAG function:

;WITH cte
     AS (SELECT *,
                grp = Sum(CASE WHEN prev_val = Status THEN 0 ELSE 1 END)
                        OVER(partition BY id ORDER BY Date)
         FROM   (SELECT *,
                        prev_val = Lag(Status)OVER(partition BY id ORDER BY Date)
                 FROM   @test) a)
SELECT id,
       Status,
       WindowStart = Min(date),
       WindowEnd = Max(date),
       Total = Count(*)
FROM   cte
GROUP  BY id, Status, grp 

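The LAG function first finds the previous status for each date; SUM() OVER() then creates the groups by incrementing a running number only when the status changes.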
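
The two steps can be inspected separately by selecting the intermediate columns instead of aggregating. This is a minimal sketch rather than part of the answer's query: @sample is a hypothetical table variable with five of the question's rows for id = 1, and is_new_run is an illustrative alias for the CASE expression that the answer sums.

```sql
DECLARE @sample TABLE (id int, [Status] int, [Date] date);

-- Hypothetical sample: five rows taken from the question's data (id = 1 only).
INSERT INTO @sample (id, [Status], [Date]) VALUES
    (1, 1, '2017-03-01'),
    (1, 1, '2017-04-01'),
    (1, 2, '2017-05-01'),
    (1, 1, '2017-06-01'),
    (1, 1, '2017-07-01');

SELECT a.id,
       a.[Status],
       a.[Date],
       a.prev_val,
       -- 1 on the first row of each run (prev_val is NULL or differs), otherwise 0.
       CASE WHEN a.prev_val = a.[Status] THEN 0 ELSE 1 END AS is_new_run,
       -- Running total of the flags: increases only where a new run starts.
       SUM(CASE WHEN a.prev_val = a.[Status] THEN 0 ELSE 1 END)
           OVER (PARTITION BY a.id ORDER BY a.[Date]) AS grp
FROM (SELECT s.id,
             s.[Status],
             s.[Date],
             -- Status of the previous row (by date) for the same id; NULL on the first row.
             LAG(s.[Status]) OVER (PARTITION BY s.id ORDER BY s.[Date]) AS prev_val
      FROM @sample s) a
ORDER BY a.id, a.[Date];
```

For these rows prev_val is NULL, 1, 1, 2, 1, so is_new_run is 1, 0, 1, 1, 0 and grp is 1, 1, 2, 3, 3. The first row counts as a new run because comparing NULL with a status is not true, so the CASE falls through to the ELSE branch.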