Generating a date range in Snowflake

Asked: 2021-05-25 09:36:56

Tags: sql date time-series snowflake-cloud-data-platform date-range

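I want to create a date range between two timestamps, with one row per day. I have seen similar posts and also tried this approach, but I still cannot get the expected output below.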

Note that if ended_at is NULL, CURRENT_TIMESTAMP should be used in its place.

Sample data:

WITH t1 AS (
SELECT 'A' AS id, '2021-05-18 18:30:00'::timestamp AS started_at, '2021-05-19 09:45:00'::timestamp AS ended_at UNION ALL
SELECT 'B' AS id, '2021-05-24 11:30:40'::timestamp AS started_at, NULL::timestamp AS ended_at
    )
SELECT *
FROM t1

Expected result:

[image in the original post, not reproduced here]

2 answers:

Answer 0: (score: 2)

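Generate a string of spaces whose length is the datediff in days, split it into an array, and flatten the array to produce rows. Use the index of each element as the number of days to add to the start date: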

WITH t1 AS (
SELECT 'A' AS id, '2021-05-18 18:30:00'::timestamp AS started_at, '2021-05-19 09:45:00'::timestamp AS ended_at UNION ALL
SELECT 'B' AS id, '2021-05-24 11:30:40'::timestamp AS started_at, NULL::timestamp AS ended_at
    )
    
-- v.index counts up from 0, so it serves as the day offset from started_at
SELECT t1.*, dateadd(day, v.index, to_date(t1.started_at)) AS date_generated
FROM t1,
     lateral flatten(
         split(space(datediff(day, to_date(t1.started_at), nvl(to_date(t1.ended_at), current_date))), ' ')
     ) v
;

Result:

ID  STARTED_AT              ENDED_AT                DATE_GENERATED
A   2021-05-18 18:30:00.000 2021-05-19 09:45:00.000 2021-05-18
A   2021-05-18 18:30:00.000 2021-05-19 09:45:00.000 2021-05-19
B   2021-05-24 11:30:40.000 null                    2021-05-24
B   2021-05-24 11:30:40.000 null                    2021-05-25
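
To see why the index in Answer 0 can be used as a day offset, the array-building step can be checked in isolation. This is a minimal sketch, assuming only Snowflake's built-in SPACE, SPLIT and ARRAY_SIZE functions; the inline table n and its day_diff column are hypothetical stand-ins for the datediff value:

```sql
-- Sketch only: day_diff is a hypothetical stand-in for
-- datediff(day, started_at, ended_at) from the answer above.
SELECT n.day_diff,
       array_size(split(space(n.day_diff), ' ')) AS element_count
FROM (
    SELECT 1 AS day_diff
    UNION ALL
    SELECT 3
) n;
```

A string of N spaces split on a single space yields N + 1 empty elements, which is consistent with the result above, where the one-day difference for id A produced two rows.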

Answer 1: (score: 1)

If you only need to generate a relatively small number of days, you can use a recursive CTE:

WITH t1 AS (
    SELECT
        'A'                              AS id,
        '2021-05-18 18:30:00'::timestamp AS started_at,
        '2021-05-19 09:45:00'::timestamp AS ended_at
    UNION ALL
    SELECT
        'B'                              AS id,
        '2021-05-24 11:30:40'::timestamp AS started_at,
        NULL::timestamp                  AS ended_at
),
     row_gen (id, started, ended, generated_day) as (
         select
             t1.id id,
             t1.started_at::date started,
             coalesce(t1.ended_at, current_timestamp)::date ended,
             started generated_day
         from t1
         union all
         select
             id,
             started,
             ended,
             dateadd('day', 1, generated_day)
         from row_gen
         where generated_day < ended
     )
SELECT *
FROM row_gen;

Result:

+--+----------+----------+-------------+
|ID|STARTED   |ENDED     |GENERATED_DAY|
+--+----------+----------+-------------+
|A |2021-05-18|2021-05-19|2021-05-18   |
|A |2021-05-18|2021-05-19|2021-05-19   |
|B |2021-05-24|2021-05-25|2021-05-24   |
|B |2021-05-24|2021-05-25|2021-05-25   |
+--+----------+----------+-------------+
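
The recursive pattern in Answer 1 may be easier to follow when reduced to a single range. The following is a minimal sketch, not the answerer's code: the CTE name days and the fixed dates are hypothetical and chosen only to show how the anchor member and the recursive member divide the work:

```sql
-- Sketch only: the fixed start and end dates are illustrative assumptions.
WITH RECURSIVE days (generated_day, ended) AS (
    -- Anchor member: emits the first day of the range
    SELECT '2021-05-18'::date, '2021-05-20'::date
    UNION ALL
    -- Recursive member: adds one day per iteration and stops
    -- once the previously generated day has reached the end date
    SELECT dateadd('day', 1, generated_day), ended
    FROM days
    WHERE generated_day < ended
)
SELECT generated_day
FROM days
ORDER BY generated_day;
```

Because the WHERE clause tests the previously generated day, the end date itself is included in the output. Each additional day costs one more iteration, which is why the answer recommends this approach for relatively short ranges.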