Skipping rows based on a LAG column (packing intervals)

Date: 2017-11-07 20:30:11

Tags: sql sql-server

I wrote a SQL query that computes the start and end times of events. The result looks like this (I saved it as TimeData):

Id        start        end
___________________________
1        100        124
2        106        115
3        127        130
4        128        130
5        136        150

The rows are ordered by start.

What I want to do now is collapse these rows into the time spans that contain data, like this:

start        end
________________
100        124
127        130
136        150

What I have so far (which is horribly wrong) is this:

select * from 

(select *,
LAG([end],1) over(order by [start]) as pe
from TimeData) as X

where X.pe < [start]

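This actually works for some of the later rows, but pe holds the [end] value of the immediately preceding row in TimeData, whereas I need it to come from the previous returned row (the previous row for which the condition was true).

I hope my question is clear. Any help is appreciated.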

2 answers:

Answer 0: (score: 1)

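This is rather painful. One approach is to determine which records start a new interval. You have to be careful with lag(), because the overlap might not be with the immediately preceding record.

Here is one method: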

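with flagged as (
      -- startgrp = 1 when no earlier row covers this row's start
      -- (assumes ids follow start order, as in the sample data)
      select t.*,
             (case when exists (select 1
                                from TimeData t2
                                where t.[start] <= t2.[end] and t.[start] >= t2.[start] and
                                      t2.id < t.id
                               )
                   then 0 else 1
               end) as startgrp
      from TimeData t
     )
-- the running sum of startgrp numbers the groups;
-- [end] is bracketed because END is a reserved word in T-SQL
select grp, min([start]) as [start], max([end]) as [end]
from (select f.*, sum(startgrp) over (order by [start]) as grp
      from flagged f
     ) f
group by grp;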
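
The caveat about lag() can be illustrated with a small Python sketch. It is not the EXISTS query from this answer: it swaps in a running maximum of the end values, which is one common alternative way to decide whether a row starts a new interval. The rows and the helper names (`starts_by_lag`, `starts_by_running_max`, `pack`) are hypothetical and chosen only for illustration.

```python
from itertools import accumulate

# Hypothetical rows (start, end), sorted by start. The third row overlaps
# the first row but not the row immediately before it.
rows = [(100, 124), (106, 115), (120, 122), (127, 130)]


def starts_by_lag(rows):
    """Flag a row as a new interval if it starts after the previous row's end."""
    flags = [True]
    for (_, prev_end), (start, _) in zip(rows, rows[1:]):
        flags.append(prev_end < start)
    return flags


def starts_by_running_max(rows):
    """Flag a row as a new interval if it starts after every earlier row's end."""
    flags = [True]
    # Running maximum of end over rows[0..i-1], paired with rows[i].
    max_ends = accumulate((end for _, end in rows), max)
    for max_end_so_far, (start, _) in zip(max_ends, rows[1:]):
        flags.append(max_end_so_far < start)
    return flags


def pack(rows, flags):
    """Merge rows into intervals, opening a new interval at each flagged row."""
    packed = []
    for (start, end), is_new in zip(rows, flags):
        if is_new:
            packed.append([start, end])
        else:
            packed[-1][1] = max(packed[-1][1], end)
    return [tuple(p) for p in packed]


print(pack(rows, starts_by_lag(rows)))          # splits off (120, 122) by mistake
print(pack(rows, starts_by_running_max(rows)))  # keeps it inside (100, 124)
```

With these rows, the lag-style check compares 120 against 115 and opens a new interval, even though the first row still covers 120. The running-maximum check compares against 124 and does not.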

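Answer 1: (score: 1)

Your problem looks like what Itzik Ben-Gan calls Packing Intervals. The method he shows in his article should be more efficient than the self-join shown in the other answer.

See his article for a detailed explanation of how it works. Run the query step by step, CTE by CTE, and examine the intermediate results to understand what each step does.

Sample data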

DECLARE @T TABLE(ID int, starttime int, endtime int);

INSERT INTO @T VALUES
(1, 100, 124),
(2, 106, 115),
(3, 127, 130),
(4, 128, 130),
(5, 136, 150);

Query

WITH 
C1 AS
(
    SELECT ID, starttime AS ts, +1 AS type, 1 AS sub
    FROM @T
    UNION ALL
    SELECT ID, endtime AS ts, -1 AS type, 0 AS sub
    FROM @T
)
,C2 AS
(
    SELECT C1.*,
        SUM(type) 
            OVER(ORDER BY ts, type DESC
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
        - sub AS cnt
    FROM C1
)
,C3 AS
(
    SELECT ID, ts,
        (ROW_NUMBER() OVER(ORDER BY ts) - 1) / 2 + 1
        AS grpnum
    FROM C2
    WHERE cnt = 0
)
SELECT MIN(ts) AS starttime, MAX(ts) AS endtime
FROM C3
GROUP BY grpnum;

Result

+-----------+---------+
| starttime | endtime |
+-----------+---------+
|       100 |     124 |
|       127 |     130 |
|       136 |     150 |
+-----------+---------+
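
For readers who want to trace the CTEs without a SQL Server instance, here is a rough Python sketch that mirrors C1, C2 and C3 on the same sample data. It is only an illustration of the counting idea, not a replacement for the query; the function name `pack_intervals` and the tuple layout are assumptions made for this sketch.

```python
# Sample rows from the answer: (ID, starttime, endtime).
rows = [(1, 100, 124), (2, 106, 115), (3, 127, 130), (4, 128, 130), (5, 136, 150)]


def pack_intervals(rows):
    # C1: one +1 event per start (sub = 1) and one -1 event per end (sub = 0).
    events = [(start, +1, 1) for _, start, _ in rows]
    events += [(end, -1, 0) for _, _, end in rows]
    # Same ordering as the window: by ts, with starts before ends on ties.
    events.sort(key=lambda e: (e[0], -e[1]))

    # C2: running sum of type, minus sub. For a start event cnt is the number
    # of intervals open before it; for an end event, the number open after it.
    running = 0
    boundaries = []
    for ts, kind, sub in events:
        running += kind
        cnt = running - sub
        # C3: keep only the events where nothing else is open (cnt = 0).
        if cnt == 0:
            boundaries.append(ts)

    # Final SELECT: consecutive boundaries pair up as (starttime, endtime).
    return list(zip(boundaries[0::2], boundaries[1::2]))


print(pack_intervals(rows))  # [(100, 124), (127, 130), (136, 150)]
```

Printing `events` and the running `cnt` inside the loop gives the same kind of intermediate view that running the query CTE by CTE would.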