How to split activities into segments based on time periods

Date: 2019-07-11 01:28:23

Tags: sql sql-server tsql sql-server-2016

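I have 2 activity types (TASK, CHAT). Each activity type has a priority; the highest priority is 10.

I have the StartTime and EndTime of each activity.

I need to determine which activity an agent is actively working on at any given time. If two activities overlap, the one with the highest priority is considered the activity the agent is working on.

My initial idea for solving this is to split the activity times into segments, using each StartTime and EndTime and the points where activities overlap as boundaries. Then, for each segment, take the activity with the highest priority.

For example: 09:00-09:15 (CHAT), 09:15-09:30 (CHAT), 09:30-09:45 (TASK)

However, I really don't know how to achieve this in SQL. I have done some research on "Gaps and Islands", but I haven't found an example that I could translate to this problem.

Could anyone advise me on the best approach to this problem, along with a possible solution, so that I can understand and implement it?

My data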

DECLARE @ActivityLog TABLE (Activity VARCHAR(4), Priority INT, StartTime DATETIME, EndTime DATETIME)

INSERT INTO @ActivityLog VALUES ('CHAT', 10, '2019/07/01 09:00', '2019/07/01 09:30');
INSERT INTO @ActivityLog VALUES ('TASK', 5, '2019/07/01 09:15', '2019/07/01 09:45');

SELECT * FROM @ActivityLog

Expected final output

+------------------+------------------+----------+
|     StartTime    |      EndTime     | Activity |
+------------------+------------------+----------+
| 2019/07/01 09:00 | 2019/07/01 09:15 |   CHAT   |
+------------------+------------------+----------+
| 2019/07/01 09:15 | 2019/07/01 09:30 |   CHAT   |
+------------------+------------------+----------+
| 2019/07/01 09:30 | 2019/07/01 09:45 |   TASK   |
+------------------+------------------+----------+

Scenario 2: TASK has an active segment both before and after CHAT.

DECLARE @ActivityLog TABLE (Activity VARCHAR(8), Priority INT, StartTime DATETIME, EndTime DATETIME)

INSERT INTO @ActivityLog VALUES ('TASK', 7, '2019/07/01 09:00', '2019/07/01 10:00')
INSERT INTO @ActivityLog VALUES ('CHAT', 10, '2019/07/01 09:15', '2019/07/01 09:45');

Expected output for scenario 2

+------------------+------------------+----------+
|     StartTime    |      EndTime     | Activity |
+------------------+------------------+----------+
| 2019/07/01 09:00 | 2019/07/01 09:15 |   TASK   |
+------------------+------------------+----------+
| 2019/07/01 09:15 | 2019/07/01 09:45 |   CHAT   |
+------------------+------------------+----------+
| 2019/07/01 09:45 | 2019/07/01 10:00 |   TASK   |
+------------------+------------------+----------+

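1 Answer:

Answer 0: (score: 1)

The approach below works when there are two overlapping periods (if you need to handle more, you will need to change the code to use a recursive common table expression).

The idea is to use a LEFT JOIN to find the overlap, then apply different logic depending on the priorities of the items.

It is simple, but ugly. I have added more cases and tests, and I hope I haven't missed anything.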

DECLARE @ActivityLog TABLE (Activity VARCHAR(4), Priority INT, StartTime DATETIME, EndTime DATETIME)

INSERT INTO @ActivityLog 
VALUES ('CHAT', 10, '2019/07/01 09:00', '2019/07/01 09:30') 
      ,('TASK', 5, '2019/07/01 09:15', '2019/07/01 09:20')
      --
      ,('CHAT', 8, '2019/07/01 19:30', '2019/07/01 20:30')
      --
      ,('TASK', 7, '2019/07/02 09:00', '2019/07/02 10:00')
      ,('CHAT', 10, '2019/07/02 09:15', '2019/07/02 09:30')
      --
      ,('CHAT', 10, '2019/12/01 09:00', '2019/12/01 09:30') 
      ,('TASK', 5, '2019/12/01 09:15', '2019/12/01 09:20');


WITH DataSourceWithRowID AS
(
    SELECT *
         ,ROW_NUMBER() OVER (ORDER BY [StartTime]) AS [row_id]
    FROM @ActivityLog
),
DataSource AS
(
    SELECT S.[Activity] AS [S_Activity], S.[Priority] AS [S_Priority], S.[StartTime] AS [S_StartTime], S.[EndTime] AS [S_EndTime], S.[row_id] AS [S_row_id]
          ,E.[Activity] AS [E_Activity], E.[Priority] AS [E_Priority], E.[StartTime] AS [E_StartTime], E.[EndTime] AS [E_EndTime], E.[row_id] AS [E_row_id]
    FROM DataSourceWithRowID S
    LEFT JOIN DataSourceWithRowID E
        ON E.[StartTime] > S.[StartTime] 
        AND E.[StartTime] < S.[EndTime]
)
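-- case 1 - handle 1 (the first SELECT of the UNION names the output columns)
SELECT [S_Activity] AS [Activity], [S_Priority] AS [Priority], [S_StartTime] AS [StartTime], [E_StartTime] AS [EndTime]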
FROM DataSource
WHERE S_Priority > E_Priority
    AND S_EndTime < E_EndTime

UNION ALL

-- case 1 - handle 2
SELECT [S_Activity], [S_Priority], [E_StartTime], [S_EndTime]
FROM DataSource
WHERE S_Priority > E_Priority
    AND S_EndTime < E_EndTime

UNION ALL

-- case 1 - handle 3
SELECT [E_Activity], [E_Priority], [S_EndTime], [E_EndTime]
FROM DataSource
WHERE S_Priority > E_Priority
    AND S_EndTime < E_EndTime

UNION ALL

-- case 1 - handle 4 -- sub-period with lower priority -> consumed by parent
SELECT [S_Activity], [S_Priority], [S_StartTime], [S_EndTime]
FROM DataSource
WHERE S_Priority > E_Priority
    AND S_EndTime > E_EndTime

UNION ALL

-- case 2 - no overlapping: emit the activity's own start and end
SELECT [S_Activity], [S_Priority], [S_StartTime], [S_EndTime]
FROM DataSource
WHERE [S_row_id] NOT IN (SELECT [E_row_id] FROM DataSource WHERE [E_row_id] IS NOT NULL)
    AND [E_row_id] IS NULL

UNION ALL

-- case 3 - handle 1
SELECT S_Activity, S_Priority, S_StartTime, E_StartTime
FROM DataSource 
WHERE S_Priority < E_Priority

UNION ALL

-- case 3 - handle 2
SELECT  E_Activity
       ,E_Priority
       ,E_StartTime
       ,IIF(S_EndTime > E_EndTime,  E_EndTime, S_EndTime)
FROM DataSource 
WHERE S_Priority < E_Priority

UNION ALL

-- case 3 - handle 3
SELECT IIF(S_EndTime > E_EndTime,  S_Activity, E_Activity)
      ,IIF(S_EndTime > E_EndTime,  S_Priority, E_Priority)
      ,IIF(S_EndTime > E_EndTime,  E_EndTime, S_EndTime)
      ,IIF(S_EndTime > E_EndTime,  S_EndTime, E_EndTime)       
FROM DataSource 
WHERE S_Priority < E_Priority
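
For more than two overlapping periods, one alternative that avoids enumerating cases is to build the segments directly from the boundary timestamps, as the question suggests. The following is only a minimal sketch of that idea, not the method used in the answer above. The CTE and column names (Boundaries, Segments, Ranked, SegStart, SegEnd, rn) are made up for illustration, and it assumes every activity has StartTime < EndTime. It uses LEAD, which is available from SQL Server 2012 onward.

```sql
-- Sample data from scenario 2 of the question.
-- ISO 8601 literals are used here so parsing does not depend on session language settings.
DECLARE @ActivityLog TABLE (Activity VARCHAR(8), Priority INT, StartTime DATETIME, EndTime DATETIME);

INSERT INTO @ActivityLog VALUES
     ('TASK', 7,  '2019-07-01T09:00:00', '2019-07-01T10:00:00')
    ,('CHAT', 10, '2019-07-01T09:15:00', '2019-07-01T09:45:00');

WITH Boundaries AS
(
    -- Every distinct start or end time is a potential segment boundary.
    SELECT StartTime AS T FROM @ActivityLog
    UNION
    SELECT EndTime FROM @ActivityLog
),
Segments AS
(
    -- Pair each boundary with the next one; the last boundary gets a NULL SegEnd.
    SELECT T AS SegStart
          ,LEAD(T) OVER (ORDER BY T) AS SegEnd
    FROM Boundaries
),
Ranked AS
(
    -- Keep the activities that fully cover each segment and rank them by priority.
    -- Segments with a NULL SegEnd, or covered by no activity, drop out of the join.
    SELECT S.SegStart, S.SegEnd, A.Activity
          ,ROW_NUMBER() OVER (PARTITION BY S.SegStart ORDER BY A.Priority DESC) AS rn
    FROM Segments S
    INNER JOIN @ActivityLog A
        ON A.StartTime <= S.SegStart
        AND A.EndTime >= S.SegEnd
)
SELECT SegStart AS StartTime, SegEnd AS EndTime, Activity
FROM Ranked
WHERE rn = 1
ORDER BY StartTime;
```

With the scenario 2 data this should return the three rows from the expected output: TASK 09:00-09:15, CHAT 09:15-09:45, TASK 09:45-10:00. Adjacent segments that belong to the same activity are not merged, which matches the expected output of the first scenario (two consecutive CHAT rows). If two overlapping activities share the same priority, ROW_NUMBER picks one of them arbitrarily, so a tie-breaker column would have to be added to the ORDER BY.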