Uniquely identifying grouped data by non-contiguous date ranges

Time: 2012-08-20 16:16:48

Tags: sql-server

Please help. I have the following data in one of my MSSQL tables.

ID | StartDateTime | EndDateTime | OrderNo |
1 | 12-08-01 08:00 | 12-08-01 08:00 | 6001 |
5 | 12-08-01 09:00 | 12-08-01 10:00 | 6001 |
7 | 12-08-01 10:00 | 12-08-01 11:00 | 6001 |
10 | 12-08-01 11:00 | 12-08-01 12:00 | 6002 |
15 | 12-08-01 12:00 | 12-08-01 13:00 | 6002 |
22 | 12-08-01 13:00 | 12-08-01 14:00 | 6003 |
29 | 12-08-01 14:00 | 12-08-01 15:00 | 6001 |
33 | 12-08-01 15:00 | 12-08-01 16:00 | 6001 |
36 | 12-08-01 16:00 | 12-08-01 17:00 | 6004 |

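The problem is that I currently cannot tell whether an OrderNo has been used more than once. For example, I cannot tell that order 6001 has run twice.

I would like to add a new field that uniquely identifies each run of an order from now on, and also go back and update the earlier records.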

ID | StartDateTime | EndDateTime | OrderNo | Run |
1 | 12-08-01 08:00 | 12-08-01 08:00 | 6001 | 1 |
5 | 12-08-01 09:00 | 12-08-01 10:00 | 6001 | 1 |
7 | 12-08-01 10:00 | 12-08-01 11:00 | 6001 | 1 |
10 | 12-08-01 11:00 | 12-08-01 12:00 | 6002 | 1 |
15 | 12-08-01 12:00 | 12-08-01 13:00 | 6002 | 1 |
22 | 12-08-01 13:00 | 12-08-01 14:00 | 6003 | 1 |
29 | 12-08-01 14:00 | 12-08-01 15:00 | 6001 | 2 |
33 | 12-08-01 15:00 | 12-08-01 16:00 | 6001 | 2 |
36 | 12-08-01 16:00 | 12-08-01 17:00 | 6004 | 1 |

I could then group by OrderNo and Run and get the following.

OrderNo | Run | RunStart | RunEnd |
6001 | 1 | 12-08-01 08:00 | 12-08-01 11:00 |
6001 | 2 | 12-08-01 14:00 | 12-08-01 16:00 |
6002 | 1 | 12-08-01 11:00 | 12-08-01 13:00 |
6003 | 1 | 12-08-01 13:00 | 12-08-01 14:00 |
6004 | 1 | 12-08-01 16:00 | 12-08-01 17:00 |
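
As a rough sketch of the grouping described above: if the table already had a populated Run column (it does not yet, which is the point of the question), the summary could come from a plain GROUP BY. The table name dbo.table_name is a placeholder, not the real name.

```sql
-- Sketch only: assumes a Run column already exists and is populated.
-- dbo.table_name is a placeholder for the real table.
SELECT
  OrderNo,
  Run,
  RunStart = MIN(StartDateTime),  -- earliest start within the run
  RunEnd   = MAX(EndDateTime)     -- latest end within the run
FROM dbo.table_name
GROUP BY OrderNo, Run
ORDER BY OrderNo, Run;
```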

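I have tried working through the data in several ways, including ROW_NUMBER, CTEs, and cursors. I feel there is a simple solution, but I can't figure it out.

I hope this makes sense.

Edit

I have changed the data table to reflect an extra complication that I did not include the first time. The solution Aaron provided would have worked, but it assumes a run can only last up to 2 hours (or rows). In my database these runs last n hours (or rows). Sorry for not being clear the first time; I appreciate the help given so far.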

1 Answer:

Answer 0: (score: 1)

Your edit actually makes the problem simpler for me (probably only because I originally missed a simpler islands approach).

;WITH x AS 
(
  -- rn1 numbers all rows by time; rn numbers rows within each OrderNo
  SELECT OrderNo, StartDateTime, EndDateTime,
    rn1 = ROW_NUMBER() OVER (ORDER BY StartDateTime), 
    rn = ROW_NUMBER() OVER (PARTITION BY OrderNo ORDER BY StartDateTime)
  FROM dbo.table_name -- you need to change this
),
y AS
(
  -- rn1 - rn is constant within one uninterrupted run of an OrderNo
  SELECT OrderNo, Island = rn1 - rn, 
    rs = MIN(StartDateTime), 
    re = MAX(EndDateTime) 
  FROM x GROUP BY OrderNo, rn1 - rn
)
SELECT 
  OrderNo, 
  Run = ROW_NUMBER() OVER (PARTITION BY OrderNo ORDER BY rs),
  RunStart = rs, 
  RunEnd = re
FROM y
ORDER BY OrderNo, Run;
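
To see why the difference of the two row numbers works as an island key, it may help to look at the intermediate values for the sample rows in the question. The sketch below loads those rows into a table variable (the name @t, the column types, and reading 12-08-01 as 2012-08-01 are all assumptions) and selects rn1, rn, and their difference. The multi-row VALUES syntax needs SQL Server 2008 or later.

```sql
-- Sketch only: @t, the column types, and the year 2012 are assumptions.
DECLARE @t TABLE
(
  ID            int      NOT NULL PRIMARY KEY,
  StartDateTime datetime NOT NULL,
  EndDateTime   datetime NOT NULL,
  OrderNo       int      NOT NULL
);

-- Sample rows copied from the question.
INSERT @t (ID, StartDateTime, EndDateTime, OrderNo) VALUES
( 1, '20120801 08:00', '20120801 08:00', 6001),
( 5, '20120801 09:00', '20120801 10:00', 6001),
( 7, '20120801 10:00', '20120801 11:00', 6001),
(10, '20120801 11:00', '20120801 12:00', 6002),
(15, '20120801 12:00', '20120801 13:00', 6002),
(22, '20120801 13:00', '20120801 14:00', 6003),
(29, '20120801 14:00', '20120801 15:00', 6001),
(33, '20120801 15:00', '20120801 16:00', 6001),
(36, '20120801 16:00', '20120801 17:00', 6004);

;WITH x AS
(
  SELECT ID, OrderNo, StartDateTime,
    rn1 = ROW_NUMBER() OVER (ORDER BY StartDateTime),
    rn  = ROW_NUMBER() OVER (PARTITION BY OrderNo ORDER BY StartDateTime)
  FROM @t
)
SELECT ID, OrderNo, rn1, rn, Island = rn1 - rn
FROM x
ORDER BY StartDateTime;

-- ID  OrderNo  rn1  rn  Island
--  1  6001     1    1   0
--  5  6001     2    2   0
--  7  6001     3    3   0
-- 10  6002     4    1   3
-- 15  6002     5    2   3
-- 22  6003     6    1   5
-- 29  6001     7    4   3
-- 33  6001     8    5   3
-- 36  6004     9    1   8
```

Note that the 6002 rows and the second 6001 run both end up with Island = 3, so the difference alone is not unique across orders. That is why the grouping is on OrderNo together with rn1 - rn.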

Leaving my original answer for posterity.


There may be a simpler way, but this gets the answer you're after using window functions.

;WITH x AS 
(
  SELECT ID, StartDateTime, EndDateTime, OrderNo,
    rn = ROW_NUMBER() OVER (PARTITION BY OrderNo ORDER BY StartDateTime) 
  FROM dbo.table_name -- you need to change this
), y AS
(
  SELECT x.ID, x.StartDateTime, x.EndDateTime, x.OrderNo, x.rn,
      x2ID = x2.ID, x2S = x2.StartDateTime, x2E = x2.EndDateTime, 
      x2O = x2.OrderNo, x2rn = x2.rn
  FROM x LEFT OUTER JOIN x AS x2
  ON x.OrderNo = x2.OrderNo
  AND x.rn = x2.rn - 1
  AND x.ID = x2.ID - 1 -- only matches when the next row's ID is exactly ID + 1
)
SELECT 
  OrderNo, 
  Run = ROW_NUMBER() OVER (PARTITION BY OrderNo ORDER BY StartDateTime),
  RunStart = StartDateTime, 
  RunEnd = COALESCE(x2E, EndDateTime) 
FROM y
WHERE x2ID IS NOT NULL 
OR NOT EXISTS 
(
  SELECT 1 FROM y AS y2 WHERE y2.OrderNo = y.OrderNo AND y2.x2rn = y.rn
)
ORDER BY OrderNo, Run;
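
Neither query above writes the run number back to the table, which the question also asks for. One possible way to backfill it, sketched under several assumptions: a new nullable int column named Run, ID as a unique key, StartDateTime values that are unique within an OrderNo, and dbo.table_name as a placeholder table name. It reuses the rn1 - rn island key from the simpler query and ranks each island by its earliest start.

```sql
-- Sketch only: the Run column, its type, and dbo.table_name are assumptions.
ALTER TABLE dbo.table_name ADD Run int NULL;
GO -- batch separator for SSMS/sqlcmd; the new column must exist before the UPDATE is compiled

;WITH x AS
(
  SELECT ID, OrderNo, StartDateTime,
    rn1 = ROW_NUMBER() OVER (ORDER BY StartDateTime),
    rn  = ROW_NUMBER() OVER (PARTITION BY OrderNo ORDER BY StartDateTime)
  FROM dbo.table_name
),
y AS
(
  -- Every row carries the earliest start of the island it belongs to.
  SELECT ID, OrderNo,
    IslandStart = MIN(StartDateTime) OVER (PARTITION BY OrderNo, rn1 - rn)
  FROM x
),
z AS
(
  -- Rows in the same island share IslandStart, so they share the same rank.
  SELECT ID,
    NewRun = DENSE_RANK() OVER (PARTITION BY OrderNo ORDER BY IslandStart)
  FROM y
)
UPDATE t
  SET t.Run = z.NewRun
FROM dbo.table_name AS t
INNER JOIN z ON z.ID = t.ID;
```

This numbers existing rows only. Assigning Run to rows inserted later would need separate logic, for example comparing each new row's OrderNo with the most recent row's.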