I wrote a SQL query to compute the start and end times of events.
The result looks like this (I saved it as TimeData):
Id start end
___________________________
1 100 124
2 106 115
3 127 130
4 128 130
5 136 150
The rows are ordered by start.
What I want to do now is collapse these rows so that they represent the time spans that contain data, like this:
start end
________________
100 124
127 130
136 150
What I have so far (and it is horribly wrong) is this:
select * from
(select *,
LAG([end],1) over(order by [start]) as pe
from TimeData) as X
where X.pe < [start]
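This actually works for some consecutive rows, but pe holds the end of the previous row in TimeData, whereas I need it to come from the previous returned row (the previous row for which the condition was true).
I hope my question is clear. Any help is appreciated.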
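Answer 0: (score: 1)
This is rather painful. One approach is to identify which records start a new interval. You have to be careful with lag(), because the overlap may not be with the immediately preceding record.
Here is one way: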
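with t as (
      -- startgrp = 1 when no earlier row (lower id) covers this row's start
      select td.*,
             (case when exists (select 1
                                from TimeData t2
                                where td.[start] <= t2.[end] and td.[start] >= t2.[start] and
                                      t2.id < td.id
                               )
                   then 0 else 1
              end) as startgrp
      from TimeData td
     )
-- a running sum of the flags numbers the groups; [start] and [end] are bracketed because END is a reserved word
select grp, min([start]) as [start], max([end]) as [end]
from (select t.*, sum(startgrp) over (order by [start]) as grp
      from t
     ) t
group by grp;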
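As a side note on the lag() caveat: a running maximum of all earlier end values can stand in for lag(), because it looks at every preceding row and not only the immediately preceding one. The following is a minimal sketch of that variant, not the query from this answer. It assumes SQL Server 2012 or later (for window frames), and the table variable @TimeData is a hypothetical stand-in for the TimeData result from the question.

```sql
-- Hypothetical stand-in for the TimeData result set from the question.
DECLARE @TimeData TABLE (Id int, [start] int, [end] int);
INSERT INTO @TimeData VALUES
(1, 100, 124),
(2, 106, 115),
(3, 127, 130),
(4, 128, 130),
(5, 136, 150);

WITH flagged AS
(
    -- A row opens a new group when its start lies beyond the largest end
    -- seen in all earlier rows. For the first row the running max is NULL,
    -- so the comparison is not true and the ELSE branch yields 1.
    SELECT Id, [start], [end],
           CASE WHEN [start] <= MAX([end]) OVER (ORDER BY [start], Id
                                                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
                THEN 0 ELSE 1
           END AS startgrp
    FROM @TimeData
),
grouped AS
(
    -- A running sum of the flags assigns one group number per packed interval.
    SELECT Id, [start], [end],
           SUM(startgrp) OVER (ORDER BY [start], Id
                               ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
SELECT MIN([start]) AS [start], MAX([end]) AS [end]
FROM grouped
GROUP BY grp
ORDER BY MIN([start]);
```

With the sample rows, row 3 (start 127) is compared against 124, the largest earlier end, and not against 115, the end of the row directly before it. That is the case where a plain lag() would look at the wrong value.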
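Answer 1: (score: 1)
Your problem looks like what Itzik Ben-Gan calls Packing Intervals. The approach he shows in his article should be more efficient than the self-join shown in the other answer.
See his article for a detailed explanation. To understand how the query works, run it step by step, CTE by CTE, and inspect the intermediate results.
Sample data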
DECLARE @T TABLE(ID int, starttime int, endtime int);
INSERT INTO @T VALUES
(1, 100, 124),
(2, 106, 115),
(3, 127, 130),
(4, 128, 130),
(5, 136, 150);
Query
WITH
C1 AS
(
SELECT ID, starttime AS ts, +1 AS type, 1 AS sub
FROM @T
UNION ALL
SELECT ID, endtime AS ts, -1 AS type, 0 AS sub
FROM @T
)
,C2 AS
(
SELECT C1.*,
SUM(type)
OVER(ORDER BY ts, type DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
- sub AS cnt
FROM C1
)
,C3 AS
(
SELECT ID, ts,
(ROW_NUMBER() OVER(ORDER BY ts) - 1) / 2 + 1
AS grpnum
FROM C2
WHERE cnt = 0
)
SELECT MIN(ts) AS starttime, MAX(ts) AS endtime
FROM C3
GROUP BY grpnum;
Result
+-----------+---------+
| starttime | endtime |
+-----------+---------+
| 100 | 124 |
| 127 | 130 |
| 136 | 150 |
+-----------+---------+
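To follow the suggestion of inspecting the intermediate results, the sketch below stops at C2 and lists the running counter for each event. It reuses the sample data and the C1/C2 definitions from the query above; only the final SELECT is added here for illustration.

```sql
DECLARE @T TABLE(ID int, starttime int, endtime int);
INSERT INTO @T VALUES
(1, 100, 124),
(2, 106, 115),
(3, 127, 130),
(4, 128, 130),
(5, 136, 150);

WITH
C1 AS
(
    -- One +1 event per interval start and one -1 event per interval end.
    SELECT ID, starttime AS ts, +1 AS type, 1 AS sub
    FROM @T
    UNION ALL
    SELECT ID, endtime AS ts, -1 AS type, 0 AS sub
    FROM @T
)
,C2 AS
(
    -- cnt = intervals open before a start event, or still open after an end event.
    SELECT C1.*,
           SUM(type)
               OVER(ORDER BY ts, type DESC
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           - sub AS cnt
    FROM C1
)
SELECT ts, type, cnt
FROM C2
ORDER BY ts, type DESC;
-- ts   type  cnt
-- 100   1    0   <- opens the first packed interval
-- 106   1    1
-- 115  -1    1
-- 124  -1    0   <- closes it
-- 127   1    0
-- 128   1    1
-- 130  -1    1
-- 130  -1    0
-- 136   1    0
-- 150  -1    0
```

Only the rows with cnt = 0 survive into C3. They alternate between an opening start and a closing end, which is why pairing them up by row number gives one group per packed interval.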