I have a table whose rows need to be split wherever an interval crosses midnight:
ID | Start               | End
--------------------------------------------
A  | 2019-03-04 23:18:04 | 2019-03-04 23:21:25
A  | 2019-03-04 23:45:05 | 2019-03-05 00:15:14
Required output:
ID | Start               | End
--------------------------------------------
A  | 2019-03-04 23:18:04 | 2019-03-04 23:21:25
A  | 2019-03-04 23:45:05 | 2019-03-04 23:59:59
A  | 2019-03-05 00:00:00 | 2019-03-05 00:15:14
Thanks!
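Answer 0 (score: 1)
Try the code below. It works only when the start and end fall on the same day or on two consecutive days. If the start and end dates are more than one day apart, the days in between are not returned.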
MSSQL:
SELECT ID,[Start],[End]
FROM Input_Table A
WHERE DATEDIFF(DD,[Start],[End]) = 0
UNION ALL
SELECT ID,[Start], CAST(CAST(CAST([Start] AS DATE) AS VARCHAR(MAX)) +' 23:59:59' AS DATETIME)
FROM Input_Table A
WHERE DATEDIFF(DD,[Start],[End]) > 0
UNION ALL
SELECT ID,CAST(CAST([End] AS DATE) AS DATETIME),[End]
FROM Input_Table A
WHERE DATEDIFF(DD,[Start],[End]) > 0
ORDER BY 1,2,3
PostgreSQL:
-- Rows that start and end on the same calendar day are returned unchanged.
SELECT ID,
TO_TIMESTAMP(startDate,'YYYY-MM-DD HH24:MI:SS'),
TO_TIMESTAMP(endDate, 'YYYY-MM-DD HH24:MI:SS')
FROM mytemp A
-- Subtracting two dates yields whole days, so this also works across month ends
-- (comparing DATE_PART('day', ...) values would only compare the day of the month).
WHERE endDate::date - startDate::date = 0
UNION ALL
-- First piece: from the start up to 23:59:59 on the start day.
SELECT ID,
TO_TIMESTAMP(startDate,'YYYY-MM-DD HH24:MI:SS'),
TO_TIMESTAMP(CONCAT(CAST(CAST (startDate AS DATE) AS VARCHAR) ,
' 23:59:59') , 'YYYY-MM-DD HH24:MI:SS')
FROM mytemp A
WHERE endDate::date - startDate::date > 0
UNION ALL
-- Second piece: from midnight on the end day up to the end.
SELECT ID,
TO_TIMESTAMP(CAST(CAST (endDate AS DATE) AS VARCHAR) ,
'YYYY-MM-DD HH24:MI:SS') ,
TO_TIMESTAMP(endDate,'YYYY-MM-DD HH24:MI:SS')
FROM mytemp A
WHERE endDate::date - startDate::date > 0;
PostgreSQL demo: Here
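To make the consecutive-days limitation concrete, here is a minimal Python sketch of the same three-branch logic, run outside the database. The function name split_two_days and the sample values are illustrative assumptions, not part of the answer; the second call uses a span of three calendar days to show that the middle day is not covered.

```python
from datetime import datetime, time

def split_two_days(start, end):
    """Mirror the three UNION ALL branches: at most two pieces per row."""
    if start.date() == end.date():
        # Branch 1: no date change, keep the row as it is.
        return [(start, end)]
    return [
        # Branch 2: start .. 23:59:59 on the start day.
        (start, datetime.combine(start.date(), time(23, 59, 59))),
        # Branch 3: midnight on the end day .. end.
        (datetime.combine(end.date(), time.min), end),
    ]

# The row from the question: split correctly into two pieces.
pieces = split_two_days(datetime(2019, 3, 4, 23, 45, 5),
                        datetime(2019, 3, 5, 0, 15, 14))

# A span touching three calendar days: still only two pieces,
# so 2019-03-07 is missing from the result.
gap = split_two_days(datetime(2019, 3, 6, 23, 45, 5),
                     datetime(2019, 3, 8, 0, 15, 14))
covered_days = {p[0].date() for p in gap}
```

Here covered_days contains 2019-03-06 and 2019-03-08 only, which is the gap the answer warns about.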
Answer 1 (score: 0)
This works even when the span is longer than one day:
WITH cte AS (
SELECT
id,
start_time,
end_time,
gs,
lag(gs) over (PARTITION BY id ORDER BY gs) -- 2
FROM
a
LEFT JOIN LATERAL
generate_series(start_time::date + 1, end_time::date, interval '1 day') gs --1
ON TRUE
)
SELECT -- 3
id,
COALESCE(lag, start_time) AS start_time,
gs - interval '1 second'
FROM
cte
WHERE gs IS NOT NULL
UNION
SELECT DISTINCT ON (id) -- 4
id,
CASE WHEN start_time::date = end_time::date THEN start_time ELSE end_time::date END, -- 5
end_time
FROM
cte
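The numbered comments in the query mean the following:
1. The generate_series function produces one row for every new day, so a row without a date change gets no value (NULL).
2. The lag() window function shifts the current date value to the next row (the current end is the next start).
3. Rows without a gs value have no date change; they are ignored at this point. For all rows with a date change: if there is no lag value, this is the first piece (there is no previous value to fetch), so the regular start_time is taken; otherwise the piece starts on a new day and takes the day break as its start. The end of the piece is the last second of the day (gs - interval '1 second').
4. The final piece starts at the beginning of the end_time day (hence the cast to date).
5. The CASE clause combines this step with the no-date-change rows that were ignored so far: if start_time and end_time fall on the same date, the original start_time is taken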
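The five steps can also be followed outside the database. The sketch below is a plain Python rendering of the same idea for a single row, assuming datetime values for start_time and end_time; the function name split_by_midnights is made up for illustration and the code is not a translation of how Postgres executes the query.

```python
from datetime import datetime, time, timedelta

def split_by_midnights(start, end):
    # Step 1: one midnight per date change, like
    # generate_series(start::date + 1, end::date, interval '1 day').
    n_changes = (end.date() - start.date()).days
    first_midnight = datetime.combine(start.date(), time.min)
    midnights = [first_midnight + timedelta(days=i)
                 for i in range(1, n_changes + 1)]

    pieces = []
    previous = None  # Step 2: plays the role of lag(gs)
    for gs in midnights:
        # Step 3: start at the previous midnight, or at start for the first piece;
        # end one second before the current midnight.
        piece_start = previous if previous is not None else start
        pieces.append((piece_start, gs - timedelta(seconds=1)))
        previous = gs

    # Steps 4 and 5: the final piece starts at midnight of the end day,
    # unless start and end share a date, in which case it is the whole row.
    if start.date() == end.date():
        last_start = start
    else:
        last_start = datetime.combine(end.date(), time.min)
    pieces.append((last_start, end))
    return pieces

three_days = split_by_midnights(datetime(2019, 3, 6, 23, 45, 5),
                                datetime(2019, 3, 8, 0, 15, 14))
```

For the three-day sample, three_days holds one piece per calendar day, with the two inner boundaries at 23:59:59 and 00:00:00.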
Answer 2 (score: 0)
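Use a recursive CTE to simulate a loop that generates the intervals: the seed row covers the range from the start up to midnight, each following row covers the next day, and so on.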
with recursive input as (
select 'A' as id, timestamp '2019-03-04 23:18:04' as s, timestamp '2019-03-04 23:21:25' as e union
select 'A' as id, timestamp '2019-03-04 23:45:05' as s, timestamp '2019-03-05 00:15:14' as e union
select 'B' as id, timestamp '2019-03-06 23:45:05' as s, timestamp '2019-03-08 00:15:14' as e union
select 'C' as id, timestamp '2019-03-10 23:45:05' as s, timestamp '2019-03-15 00:15:14' as e
), generate_id as (
select row_number() over () as unique_id, * from input
), rec (unique_id, id, s, e) as (
select unique_id, id, s, least(e, s::date::timestamp + interval '1 day')
from generate_id seed
union
select remaining.unique_id, remaining.id, previous.e, least(remaining.e, previous.e::date::timestamp + interval '1 day')
from rec as previous
join generate_id remaining on previous.unique_id = remaining.unique_id and previous.e < remaining.e
)
select id, s, e from rec
order by id,s,e
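Notes:
The id column does not appear to be unique, so I added a custom unique_id column. If id is unique, the generate_id CTE is not needed. A unique key is unavoidable for the recursive query to work correctly.
Update: the query works on Postgres. The OP originally tagged the question postgres and later changed the tag to redshift.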
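The loop that the recursive CTE simulates can be written out directly. This is a hedged Python sketch for one row, with the hypothetical helper names next_midnight and split_iteratively; like the query, it ends each piece at the following midnight (00:00:00) rather than at 23:59:59.

```python
from datetime import datetime, time, timedelta

def next_midnight(ts):
    # Same as ts::date::timestamp + interval '1 day' in the query.
    return datetime.combine(ts.date(), time.min) + timedelta(days=1)

def split_iteratively(s, e):
    # Seed row: from the start up to midnight, or to e if that comes first.
    piece_end = min(e, next_midnight(s))
    pieces = [(s, piece_end)]
    # Recursive step: keep going while the previous end is before the real end
    # (the join condition previous.e < remaining.e).
    while piece_end < e:
        piece_start = piece_end
        piece_end = min(e, next_midnight(piece_start))
        pieces.append((piece_start, piece_end))
    return pieces

row_b = split_iteratively(datetime(2019, 3, 6, 23, 45, 5),
                          datetime(2019, 3, 8, 0, 15, 14))
```

Subtracting one second from every piece end that is exactly midnight would give the 23:59:59 boundaries shown in the question.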
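Answer 3 (score: 0)
Unfortunately, Redshift has no convenient way to generate a series of numbers. If your table is big enough, you can use it to generate the numbers. "Big enough" means that it has more rows than the number of days in the longest span. If it does not, perhaps another table will do.
Once you have the numbers, you can use the following logic: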
with n as (
select row_number() over () - 1 as n
from t
)
select t.id,
greatest(t.s, date_trunc('day', t.s) + n.n * interval '1 day') as s,
least(t.e, date_trunc('day', t.s) + (n.n + 1) * interval '1 day' - interval '1 second') as e
from t join
n
on t.e >= date_trunc('day', t.s) + n.n * interval '1 day';
Here is a db<>fiddle. It uses an old version of Postgres, but not one old enough to match Redshift.
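The join against the numbers can be checked with a small Python sketch. It assumes a single row with datetime values s and e and a caller-supplied sequence of numbers starting at 0; the name split_with_numbers is illustrative only. The second call shows what "big enough" means: with too few numbers, the trailing days are silently dropped.

```python
from datetime import datetime, time, timedelta

def split_with_numbers(s, e, numbers):
    day0 = datetime.combine(s.date(), time.min)  # date_trunc('day', s)
    one_day = timedelta(days=1)
    pieces = []
    for n in numbers:
        day_start = day0 + n * one_day
        # Join condition: keep n only while the n-th day starts on or before e.
        if e >= day_start:
            piece_start = max(s, day_start)  # greatest(...)
            piece_end = min(e, day_start + one_day - timedelta(seconds=1))  # least(...)
            pieces.append((piece_start, piece_end))
    return pieces

s = datetime(2019, 3, 6, 23, 45, 5)
e = datetime(2019, 3, 8, 0, 15, 14)
enough = split_with_numbers(s, e, range(5))     # more numbers than days spanned
too_few = split_with_numbers(s, e, range(2))    # fewer numbers than days spanned
```

With range(5) all three days are returned; with range(2) the piece for 2019-03-08 is missing.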