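Each row in my table holds one daily period with a start and an end date, plus a flag saying whether the row applies to this query. I need to return all intervals of consecutive days that contain no gaps.
Sample table: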
+----+---------------------+---------------------+--------+--------------------------------------+
| ID | DATE_START | DATE_END | STATUS | COMMENT |
|----+---------------------+---------------------+--------+--------------------------------------+
| 1 | 2019-01-01 00:00:00 | 2019-01-02 00:00:00 | 1 | |
|----+---------------------+---------------------+--------+--------------------------------------+
| 2 | 2019-01-02 00:00:00 | 2019-01-03 00:00:00 | 1 | |
|----+---------------------+---------------------+--------+--------------------------------------+
| | | | | <-- visual gap: the following dates |
| | | | | <-- are more than 1 day off |
|----+---------------------+---------------------+--------+--------------------------------------+
| 10 | 2019-02-07 06:00:00 | 2019-02-08 06:00:00 | 1 | |
|----+---------------------+---------------------+--------+--------------------------------------+
| 11 | 2019-02-08 06:00:00 | 2019-02-09 06:00:00 | 1 | |
|----+---------------------+---------------------+--------+--------------------------------------+
| 12 | 2019-02-09 06:00:00 | 2019-02-10 06:00:00 | 1 | |
|----+---------------------+---------------------+--------+--------------------------------------+
| 13 | 2019-02-10 06:00:00 | 2019-02-11 06:00:00 | 0 | <-- gap, as STATUS=0 |
|----+---------------------+---------------------+--------+--------------------------------------+
| 14 | 2019-02-11 06:00:00 | 2019-02-12 06:00:00 | 1 | |
|----+---------------------+---------------------+--------+--------------------------------------+
The result table should look like this:
+---------------------+---------------------+----------+
| INTERVAL_START | INTERVAL_END | IDS |
+---------------------+---------------------+----------+
| 2019-01-01 00:00:00 | 2019-01-03 00:00:00 | 1,2 |
+---------------------+---------------------+----------+
| 2019-02-07 06:00:00 | 2019-02-10 06:00:00 | 10,11,12 |
+---------------------+---------------------+----------+
| 2019-02-11 06:00:00 | 2019-02-12 06:00:00 | 14 |
+---------------------+---------------------+----------+
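OK, selecting STATUS<>0 is straightforward. What I'm struggling with is how to start the recursive lookup: check whether the next day is available too and, if it is, continue until the next day is not (collecting the IDs of those days along the way).
Since this query will include a lot of other data, that is not a problem at all. I just can't wrap my head around this recursive approach.
It would be very helpful if the solution stayed as close to standard SQL as possible, since this query may be ported in the future.
Edit: Oh, and as you can see from the timestamps here, DATE_START always has the same hour/minute as the previous day's DATE_END (if a previous day exists).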
Answer 0 (score: 1)
Oracle setup:
CREATE TABLE test_data ( ID, DATE_START, DATE_END, STATUS ) AS
SELECT 1, DATE '2019-01-01', DATE '2019-01-02', 1 FROM DUAL UNION ALL
SELECT 2, DATE '2019-01-02', DATE '2019-01-03', 1 FROM DUAL UNION ALL
SELECT 10, DATE '2019-01-07' + INTERVAL '6' HOUR, DATE '2019-01-08' + INTERVAL '6' HOUR, 1 FROM DUAL UNION ALL
SELECT 11, DATE '2019-01-08' + INTERVAL '6' HOUR, DATE '2019-01-09' + INTERVAL '6' HOUR, 1 FROM DUAL UNION ALL
SELECT 12, DATE '2019-01-09' + INTERVAL '6' HOUR, DATE '2019-01-10' + INTERVAL '6' HOUR, 1 FROM DUAL UNION ALL
SELECT 13, DATE '2019-01-10' + INTERVAL '6' HOUR, DATE '2019-01-11' + INTERVAL '6' HOUR, 0 FROM DUAL UNION ALL
SELECT 14, DATE '2019-01-11' + INTERVAL '6' HOUR, DATE '2019-01-12' + INTERVAL '6' HOUR, 1 FROM DUAL
Query:
SELECT MIN( date_start ) AS date_start,
       MAX( date_end ) AS date_end,
       LISTAGG( id, ',' ) WITHIN GROUP ( ORDER BY date_start, date_end ) AS ids
FROM   (
  SELECT id,
         date_start,
         date_end,
         status,
         SUM( change_group ) OVER ( ORDER BY date_start, date_end )
           AS group_id
  FROM   (
    SELECT t.*,
           CASE
           WHEN date_start = LAG( date_end ) OVER ( ORDER BY date_start, date_end )
           AND  1 = LAG( status ) OVER ( ORDER BY date_start, date_end )
           AND  1 = status
           THEN 0
           ELSE 1
           END AS change_group
    FROM   test_data t
  )
  WHERE  status = 1
)
GROUP BY group_id
Output:
DATE_START          | DATE_END            | IDS
:------------------ | :------------------ | :-------
2019-01-01 00:00:00 | 2019-01-03 00:00:00 | 1,2
2019-01-07 06:00:00 | 2019-01-10 06:00:00 | 10,11,12
2019-01-11 06:00:00 | 2019-01-12 06:00:00 | 14
db<>fiddle here
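For anyone who finds the nested window functions hard to follow, here is a minimal Python sketch of the same idea, not a replacement for the SQL. It flags each row with 0 when it continues the previous row and 1 otherwise, drops the STATUS = 0 rows, and then takes a running total of the flags to get a group number. The function name `islands` and the tuple layout are made up for illustration; the rows are copied from the test_data setup in this answer.

```python
from datetime import datetime
from itertools import accumulate

# (id, date_start, date_end, status), copied from the test_data setup above
rows = [
    (1,  datetime(2019, 1, 1, 0),  datetime(2019, 1, 2, 0),  1),
    (2,  datetime(2019, 1, 2, 0),  datetime(2019, 1, 3, 0),  1),
    (10, datetime(2019, 1, 7, 6),  datetime(2019, 1, 8, 6),  1),
    (11, datetime(2019, 1, 8, 6),  datetime(2019, 1, 9, 6),  1),
    (12, datetime(2019, 1, 9, 6),  datetime(2019, 1, 10, 6), 1),
    (13, datetime(2019, 1, 10, 6), datetime(2019, 1, 11, 6), 0),
    (14, datetime(2019, 1, 11, 6), datetime(2019, 1, 12, 6), 1),
]


def islands(rows):
    """Hypothetical helper mirroring the LAG + running SUM query."""
    # Same ordering as the window clause: ORDER BY date_start, date_end
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))

    # change_group: 0 if this row continues the previous one, else 1
    flags = []
    prev = None
    for r in ordered:
        continues = (
            prev is not None
            and r[1] == prev[2]   # date_start = LAG(date_end)
            and prev[3] == 1      # 1 = LAG(status)
            and r[3] == 1         # 1 = status
        )
        flags.append(0 if continues else 1)
        prev = r

    # WHERE status = 1, applied before the running total as in the SQL
    kept = [(r, f) for r, f in zip(ordered, flags) if r[3] == 1]

    # group_id: running total of the flags over the remaining rows
    group_ids = accumulate(f for _, f in kept)

    # GROUP BY group_id with MIN(date_start), MAX(date_end), list of ids
    result = {}
    for (r, _), g in zip(kept, group_ids):
        start, end, ids = result.get(g, (r[1], r[2], []))
        result[g] = (min(start, r[1]), max(end, r[2]), ids + [r[0]])
    return [(s, e, ",".join(map(str, ids))) for s, e, ids in result.values()]


for interval in islands(rows):
    print(interval)
```

The point to notice is that no recursion is needed: one ordered pass to compute the flags and one running total are enough, which is what LAG and SUM ... OVER do in the query.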
Answer 1 (score: 1)
MT0 is correct, although I think the count ... filter clause is easier to read than sum ... case:
with t as (
select 1 as id, timestamp '2019-01-01 00:00:00' as date_start, timestamp '2019-01-02 00:00:00' as date_end, 1 as status union
select 2 as id, timestamp '2019-01-02 00:00:00' as date_start, timestamp '2019-01-03 00:00:00' as date_end, 1 as status union
select 10 as id, timestamp '2019-01-07 06:00:00' as date_start, timestamp '2019-01-08 06:00:00' as date_end, 1 as status union
select 11 as id, timestamp '2019-01-08 06:00:00' as date_start, timestamp '2019-01-09 06:00:00' as date_end, 1 as status union
select 12 as id, timestamp '2019-01-09 06:00:00' as date_start, timestamp '2019-01-10 06:00:00' as date_end, 1 as status union
select 13 as id, timestamp '2019-01-10 06:00:00' as date_start, timestamp '2019-01-11 06:00:00' as date_end, 0 as status union
select 14 as id, timestamp '2019-01-11 06:00:00' as date_start, timestamp '2019-01-12 06:00:00' as date_end, 1 as status
), t2 as (
  select t.*, lag(date_end) over (order by date_start) as prev_date_end
  from t
  where status = 1
), t3 as (
  select t2.*, count(1) filter (where date_start is distinct from prev_date_end) over (order by date_start) as g
  from t2
)
select min(date_start), max(date_end), string_agg(cast(id as text),',')
from t3
group by g
order by g
Tested on https://www.db-fiddle.com/ with PG 9.6.
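Since the question asks for SQL that ports easily, note that not every engine implements FILTER or IS DISTINCT FROM. Below is a rough sketch of the same query with the counter written as SUM(CASE ...), run on SQLite through Python's sqlite3 module purely as a stand-in for "some other engine". Assumptions: the bundled SQLite is 3.25 or newer (needed for window functions), the table name test_data is borrowed from the other answer, timestamps are stored as ISO-8601 text, and group_concat is SQLite's own counterpart of string_agg / LISTAGG, so that one call still has to be swapped per engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_data (id INTEGER, date_start TEXT, date_end TEXT, status INTEGER)"
)
conn.executemany(
    "INSERT INTO test_data VALUES (?, ?, ?, ?)",
    [
        (1,  "2019-01-01 00:00:00", "2019-01-02 00:00:00", 1),
        (2,  "2019-01-02 00:00:00", "2019-01-03 00:00:00", 1),
        (10, "2019-01-07 06:00:00", "2019-01-08 06:00:00", 1),
        (11, "2019-01-08 06:00:00", "2019-01-09 06:00:00", 1),
        (12, "2019-01-09 06:00:00", "2019-01-10 06:00:00", 1),
        (13, "2019-01-10 06:00:00", "2019-01-11 06:00:00", 0),
        (14, "2019-01-11 06:00:00", "2019-01-12 06:00:00", 1),
    ],
)

QUERY = """
WITH t2 AS (
    -- keep only usable rows, then look at the previous row's end
    SELECT id, date_start, date_end,
           LAG(date_end) OVER (ORDER BY date_start) AS prev_date_end
    FROM test_data
    WHERE status = 1
), t3 AS (
    -- running count of rows that do NOT continue the previous row
    SELECT t2.*,
           SUM(CASE WHEN prev_date_end IS NULL OR date_start <> prev_date_end
                    THEN 1 ELSE 0 END) OVER (ORDER BY date_start) AS g
    FROM t2
)
-- group_concat is SQLite-specific; its element order is not guaranteed
SELECT MIN(date_start), MAX(date_end), group_concat(id, ',')
FROM t3
GROUP BY g
ORDER BY g
"""

intervals = conn.execute(QUERY).fetchall()
for interval in intervals:
    print(interval)
```

The explicit `prev_date_end IS NULL OR ...` test takes the place of IS DISTINCT FROM, so the first row still opens a group even though its LAG value is NULL.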