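Grouping rows by consecutive time values

Posted: 2021-01-14 08:30:10

Tags: sql sql-server

I am trying to put rows that have consecutive time values within the same day into a single group.

For example, my current dataset looks like this: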

Date          StartTime     EndTime       StudentID     Type     Class     Work    Group *
2020-01-30     09:00:00     11:00:00         20789        A        178       56     1 
2020-01-30     11:00:00     13:00:00         20789        A        789       67     1
2021-01-08     09:00:00     10:00:00         78945        D        195       13     2
2021-01-08     10:00:00     12:00:00         78945        D        789       12     2
2021-01-08     13:00:00     14:00:00         78945        D        398       13     3
2021-01-08     14:00:00     16:00:00         78945        D        543       13     3

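If rows have the same StudentID and Type, and their Start/End times are consecutive within the same day, I want to assign them the same unique group ID, as shown in the Group column of the dataset. I tried to create the Group column using LAG/LEAD with PARTITION BY, but my current code does not work.

Can anyone help me solve this?

Thanks.

1 answer:

Answer 0 (score: 0)

You can use window functions:

Table: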

SELECT *
INTO Data
FROM (VALUES
   (CONVERT(date, '2020-01-30'), CONVERT(time, '09:00:00'), CONVERT(time, '11:00:00'), 20789, 'A', 178, 56), 
   (CONVERT(date, '2020-01-30'), CONVERT(time, '11:00:00'), CONVERT(time, '13:00:00'), 20789, 'A', 789, 67),
   (CONVERT(date, '2021-01-08'), CONVERT(time, '09:00:00'), CONVERT(time, '10:00:00'), 78945, 'D', 195, 13),
   (CONVERT(date, '2021-01-08'), CONVERT(time, '10:00:00'), CONVERT(time, '12:00:00'), 78945, 'D', 789, 12),
   (CONVERT(date, '2021-01-08'), CONVERT(time, '13:00:00'), CONVERT(time, '14:00:00'), 78945, 'D', 398, 13),
   (CONVERT(date, '2021-01-08'), CONVERT(time, '14:00:00'), CONVERT(time, '16:00:00'), 78945, 'D', 543, 13)
) v (Date, StartTime, EndTime, StudentID, Type, Class, Work)

Statement:

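SELECT
   Date, StartTime, EndTime, StudentID, Type, Class, Work,
   -- Running total of the flags: every 1 starts a new group.
   -- The ordering keeps each student's rows for a given day together.
   SUM(Change) OVER (ORDER BY Date, StudentID, Type, StartTime) AS GroupID
FROM (
   SELECT 
      Date, StartTime, EndTime, StudentID, Type, Class, Work,
      -- 0 when the row starts exactly where the previous row for the same
      -- student, type and day ended; 1 otherwise (including the first row).
      CASE 
         WHEN StartTime = LAG(EndTime) OVER (
                 PARTITION BY StudentID, Type, Date
                 ORDER BY StartTime
              ) THEN 0
         ELSE 1
      END AS Change   
   FROM Data
) t   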

Result:

Date        StartTime         EndTime           StudentID  Type  Class  Work  GroupID
2020-01-30  09:00:00.0000000  11:00:00.0000000  20789      A     178    56    1
2020-01-30  11:00:00.0000000  13:00:00.0000000  20789      A     789    67    1
2021-01-08  09:00:00.0000000  10:00:00.0000000  78945      D     195    13    2
2021-01-08  10:00:00.0000000  12:00:00.0000000  78945      D     789    12    2
2021-01-08  13:00:00.0000000  14:00:00.0000000  78945      D     398    13    3
2021-01-08  14:00:00.0000000  16:00:00.0000000  78945      D     543    13    3
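
The statement above follows the usual gaps-and-islands pattern: flag every row that starts a new run, then take a running sum of the flags. To see how the group numbers come about, it may help to display the intermediate values side by side. The following is only a sketch against the Data table created above; the names Flagged, PrevEndTime and IsNewGroup are made up for illustration, and partitioning by StudentID, Type and Date is my reading of the "same student, same type, same day" requirement in the question.

```sql
-- Sketch only: shows the previous end time and the new-group flag per row.
-- Assumes the Data table from the answer; requires SQL Server 2012 or later for LAG.
WITH Flagged AS (
   SELECT
      Date, StartTime, EndTime, StudentID, Type,
      -- End time of the previous row for the same student, type and day
      -- (NULL for the first row of each such partition).
      LAG(EndTime) OVER (
         PARTITION BY StudentID, Type, Date
         ORDER BY StartTime
      ) AS PrevEndTime
   FROM Data
)
SELECT
   Date, StartTime, EndTime, StudentID, Type, PrevEndTime,
   -- 1 when the row does not continue the previous one; a NULL PrevEndTime
   -- makes the comparison unknown, so the ELSE branch applies.
   CASE WHEN StartTime = PrevEndTime THEN 0 ELSE 1 END AS IsNewGroup,
   -- Running sum of the flag yields the group number.
   SUM(CASE WHEN StartTime = PrevEndTime THEN 0 ELSE 1 END) OVER (
      ORDER BY Date, StudentID, Type, StartTime
      ROWS UNBOUNDED PRECEDING
   ) AS GroupID
FROM Flagged
ORDER BY Date, StudentID, Type, StartTime;
```

For the six sample rows, IsNewGroup should come out as 1, 0, 1, 0, 1, 0, which accumulates to the group IDs 1, 1, 2, 2, 3, 3. If two rows for the same student share a start time, the ordering is ambiguous and the result is not well defined, so such data would need an extra tie-breaking column.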