Question

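I have a table in which each row represents one day, with a start date, an end date, and a flag indicating whether that day is applicable for this query. I need to return all intervals of days that contain no gaps.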

Example table:

+----+---------------------+---------------------+--------+--------------------------------------+
| ID | DATE_START          | DATE_END            | STATUS | COMMENT                              |
|----+---------------------+---------------------+--------+--------------------------------------+
|  1 | 2019-01-01 00:00:00 | 2019-01-02 00:00:00 | 1      |                                      |
|----+---------------------+---------------------+--------+--------------------------------------+
|  2 | 2019-01-02 00:00:00 | 2019-01-03 00:00:00 | 1      |                                      |
|----+---------------------+---------------------+--------+--------------------------------------+
|    |                     |                     |        | <-- did this gap visually, following |
|    |                     |                     |        | <-- dates are more than 1 day off    |
|----+---------------------+---------------------+--------+--------------------------------------+
| 10 | 2019-02-07 06:00:00 | 2019-02-08 06:00:00 | 1      |                                      |
|----+---------------------+---------------------+--------+--------------------------------------+
| 11 | 2019-02-08 06:00:00 | 2019-02-09 06:00:00 | 1      |                                      |
|----+---------------------+---------------------+--------+--------------------------------------+
| 12 | 2019-02-09 06:00:00 | 2019-02-10 06:00:00 | 1      |                                      |
|----+---------------------+---------------------+--------+--------------------------------------+
| 13 | 2019-02-10 06:00:00 | 2019-02-11 06:00:00 | 0      | <-- gap, as STATUS=0                 |
|----+---------------------+---------------------+--------+--------------------------------------+
| 14 | 2019-02-11 06:00:00 | 2019-02-12 06:00:00 | 1      |                                      |
|----+---------------------+---------------------+--------+--------------------------------------+

The result table should look like this:

+---------------------+---------------------+----------+
| INTERVAL_START      | INTERVAL_END        | IDS      |
+---------------------+---------------------+----------+
| 2019-01-01 00:00:00 | 2019-01-03 00:00:00 | 1,2      |
+---------------------+---------------------+----------+
| 2019-02-07 06:00:00 | 2019-02-10 06:00:00 | 10,11,12 |
+---------------------+---------------------+----------+
| 2019-02-11 06:00:00 | 2019-02-12 06:00:00 | 14       |
+---------------------+---------------------+----------+

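Well, filtering on STATUS <> 0 is the easy part. What I am struggling with is how to start a recursive lookup: check whether the next day is also available and, if so, continue until the next day is not (collecting the IDs of those days along the way).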

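Since this query will be wrapped around a lot of other data, that part is not a problem at all. I just cannot wrap my head around the recursive approach.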

It would be very helpful if the solution stayed as close to standard SQL as possible, since this query may be ported in the future.

Edit: Oh, and as you can see from the timestamps here, DATE_START always has the same hour/minute as the previous day's DATE_END (if one exists).

Answer 1

Oracle setup:

CREATE TABLE test_data ( ID, DATE_START, DATE_END, STATUS ) AS
  SELECT  1, DATE '2019-01-01', DATE '2019-01-02', 1 FROM DUAL UNION ALL
  SELECT  2, DATE '2019-01-02', DATE '2019-01-03', 1 FROM DUAL UNION ALL
  SELECT 10, DATE '2019-01-07' + INTERVAL '6' HOUR, DATE '2019-01-08' + INTERVAL '6' HOUR, 1 FROM DUAL UNION ALL
  SELECT 11, DATE '2019-01-08' + INTERVAL '6' HOUR, DATE '2019-01-09' + INTERVAL '6' HOUR, 1 FROM DUAL UNION ALL
  SELECT 12, DATE '2019-01-09' + INTERVAL '6' HOUR, DATE '2019-01-10' + INTERVAL '6' HOUR, 1 FROM DUAL UNION ALL
  SELECT 13, DATE '2019-01-10' + INTERVAL '6' HOUR, DATE '2019-01-11' + INTERVAL '6' HOUR, 0 FROM DUAL UNION ALL
  SELECT 14, DATE '2019-01-11' + INTERVAL '6' HOUR, DATE '2019-01-12' + INTERVAL '6' HOUR, 1 FROM DUAL

Query:

SELECT MIN( date_start ) AS date_start,
       MAX( date_end   ) AS date_end,
       LISTAGG( id, ',' ) WITHIN GROUP ( ORDER BY date_start, date_end ) AS ids
FROM   (
  SELECT id,
         date_start,
         date_end,
         status,
         SUM( change_group ) OVER ( ORDER BY date_start, date_end )
           AS group_id
  FROM   (
    SELECT t.*,
           CASE
           WHEN date_start = LAG( date_end ) OVER ( ORDER BY date_start, date_end )
           AND  1          = LAG( status   ) OVER ( ORDER BY date_start, date_end )
           AND  1          = status
           THEN 0
           ELSE 1
           END AS change_group
    FROM   test_data t
  )
  WHERE  status = 1
)
GROUP BY group_id

Output:

DATE_START          | DATE_END            | IDS     
:------------------ | :------------------ | :-------
2019-01-01 00:00:00 | 2019-01-03 00:00:00 | 1,2     
2019-01-07 06:00:00 | 2019-01-10 06:00:00 | 10,11,12
2019-01-11 06:00:00 | 2019-01-12 06:00:00 | 14

db<>fiddle here
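To make the three nested levels of the query easier to follow, here is a minimal procedural sketch of the same idea in Python. It is not part of the original answer: the names `ROWS` and `merge_intervals` are hypothetical, and timestamps are kept as ISO-formatted strings on the assumption that they then sort and compare like real timestamps.

```python
# Sample rows from the Oracle setup above: (id, date_start, date_end, status).
ROWS = [
    (1, "2019-01-01 00:00:00", "2019-01-02 00:00:00", 1),
    (2, "2019-01-02 00:00:00", "2019-01-03 00:00:00", 1),
    (10, "2019-01-07 06:00:00", "2019-01-08 06:00:00", 1),
    (11, "2019-01-08 06:00:00", "2019-01-09 06:00:00", 1),
    (12, "2019-01-09 06:00:00", "2019-01-10 06:00:00", 1),
    (13, "2019-01-10 06:00:00", "2019-01-11 06:00:00", 0),
    (14, "2019-01-11 06:00:00", "2019-01-12 06:00:00", 1),
]


def merge_intervals(rows):
    """Return (interval_start, interval_end, ids) for each gap-free run."""
    # ORDER BY date_start, date_end
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))

    # Innermost SELECT: change_group is 0 when a row continues the previous
    # row (previous end == this start, both rows have status 1), else 1.
    flagged = []
    prev = None
    for row in ordered:
        _, start, _, status = row
        continues = (
            prev is not None
            and start == prev[2]
            and prev[3] == 1
            and status == 1
        )
        flagged.append((row, 0 if continues else 1))
        prev = row

    # Middle SELECT: keep status = 1 rows, running sum of change_group.
    groups = {}
    group_id = 0
    for (row_id, start, end, status), change_group in flagged:
        if status != 1:
            continue
        group_id += change_group
        groups.setdefault(group_id, []).append((row_id, start, end))

    # Outer SELECT: MIN(date_start), MAX(date_end), comma-separated ids.
    return [
        (
            min(start for _, start, _ in members),
            max(end for _, _, end in members),
            ",".join(str(row_id) for row_id, _, _ in members),
        )
        for members in groups.values()
    ]


for interval in merge_intervals(ROWS):
    print(interval)
```

No recursion is needed: one ordered pass marks where each interval starts, and the running sum turns those marks into group numbers that can be aggregated.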

Answer 2

MT0 is correct, although I think a count ... filter clause is easier to read than sum ... case:

with t as (
  select 1 as id, timestamp '2019-01-01 00:00:00' as date_start, timestamp '2019-01-02 00:00:00' as date_end, 1 as status union
  select 2 as id, timestamp '2019-01-02 00:00:00' as date_start, timestamp '2019-01-03 00:00:00' as date_end, 1 as status union
  select 10 as id, timestamp '2019-01-07 06:00:00' as date_start, timestamp '2019-01-08 06:00:00' as date_end, 1 as status union
  select 11 as id, timestamp '2019-01-08 06:00:00' as date_start, timestamp '2019-01-09 06:00:00' as date_end, 1 as status union
  select 12 as id, timestamp '2019-01-09 06:00:00' as date_start, timestamp '2019-01-10 06:00:00' as date_end, 1 as status union
  select 13 as id, timestamp '2019-01-10 06:00:00' as date_start, timestamp '2019-01-11 06:00:00' as date_end, 0 as status union
  select 14 as id, timestamp '2019-01-11 06:00:00' as date_start, timestamp '2019-01-12 06:00:00' as date_end, 1 as status
), t2 as (
  select t.*, lag(date_end) over (order by date_start) as prev_date_end
  from t
  where status = 1
), t3 as (
  select t2.*, count(1) filter (where date_start is distinct from prev_date_end) over (order by date_start) as g
  from t2
)
select min(date_start), max(date_end), string_agg(cast(id as text), ',' order by date_start) from t3
group by g
order by g

Tested with PG 9.6 on https://www.db-fiddle.com/.
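Since the question asks for something portable, here is a hedged sketch of the same filter-first approach written without `filter` or `is distinct from`, for engines that lack them. It runs against an in-memory SQLite database through Python's standard `sqlite3` module purely so that it can be executed; this assumes a bundled SQLite of version 3.25 or newer (needed for window functions). The table name `t`, the function `find_intervals`, and the storage of timestamps as ISO text are assumptions of this sketch. The ID list is replaced by a row count because the string aggregation function differs across engines (`LISTAGG`, `string_agg`, `group_concat`).

```python
import sqlite3

# Sample rows from the answers above: (id, date_start, date_end, status).
ROWS = [
    (1, "2019-01-01 00:00:00", "2019-01-02 00:00:00", 1),
    (2, "2019-01-02 00:00:00", "2019-01-03 00:00:00", 1),
    (10, "2019-01-07 06:00:00", "2019-01-08 06:00:00", 1),
    (11, "2019-01-08 06:00:00", "2019-01-09 06:00:00", 1),
    (12, "2019-01-09 06:00:00", "2019-01-10 06:00:00", 1),
    (13, "2019-01-10 06:00:00", "2019-01-11 06:00:00", 0),
    (14, "2019-01-11 06:00:00", "2019-01-12 06:00:00", 1),
]

# Filtering on status first means a status = 0 row simply disappears, so the
# following row no longer lines up with its predecessor and starts a new group.
QUERY = """
with t2 as (
  select id, date_start, date_end,
         lag(date_end) over (order by date_start) as prev_date_end
  from t
  where status = 1
), t3 as (
  select t2.*,
         -- "prev_date_end is null or date_start <> prev_date_end" spells out
         -- what "is distinct from" does for a non-null date_start
         sum(case when prev_date_end is null or date_start <> prev_date_end
                  then 1 else 0 end)
           over (order by date_start) as g
  from t2
)
select g, min(date_start), max(date_end), count(*)
from t3
group by g
order by g
"""


def find_intervals(rows):
    """Load rows into an in-memory table and return one tuple per interval."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table t "
            "(id integer, date_start text, date_end text, status integer)"
        )
        conn.executemany("insert into t values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


for row in find_intervals(ROWS):
    print(row)
```

Each result tuple holds the group number, the interval start, the interval end, and the number of rows merged into that interval.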

How to select DATE intervals with no gaps?
