I have date ranges, each with a start date and an end date, such as:
'2017-01-01', '2017-01-31'
'2017-01-04', '2017-02-20'
'2017-02-21', '2017-03-29'
'2017-03-17', '2017-04-12'
I need this output:
'2017-01-01', '2017-02-20'
'2017-02-21', '2017-04-12'
Is it possible to get this output using only Postgres functions? Can someone help me?
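Answer 0 (score: 0)
Similar questions have been asked so many times that I feel I should write out an explanation of one possible logic. To make the problem less trivial I am adding one more row (otherwise you would only ever compare pairs of rows, never three at a time):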
t=# insert into t select '2017-01-08','2017-02-12';
INSERT 0 1
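The definition of t is not shown here. The queries below only rely on two date columns named s and e, so a setup along these lines should reproduce the data; the column types and the absence of keys or constraints are assumptions, not something taken from the question. It would have to run before the insert shown above.

```sql
-- Hypothetical setup: only the table name t and the column names s, e
-- come from the queries; the date type and missing constraints are assumptions.
create table t (s date, e date);

-- The four ranges from the question.
insert into t values
  ('2017-01-01', '2017-01-31'),
  ('2017-01-04', '2017-02-20'),
  ('2017-02-21', '2017-03-29'),
  ('2017-03-17', '2017-04-12');
```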
Now the "logic":
t=# with c as (select daterange(s,e), s,e, daterange(s,e) && lead(daterange(s,e)) over (order by s) ovps from t)
select *, case when not lag(ovps) over (order by s) or row_number() over (order by s) = 1 then s end strt, case when not ovps or row_number() over (order by s) = count(1) over () then e end fnsh
from c;
        daterange        |     s      |     e      | ovps |    strt    |    fnsh
-------------------------+------------+------------+------+------------+------------
 [2017-01-01,2017-01-31) | 2017-01-01 | 2017-01-31 | t    | 2017-01-01 |
 [2017-01-04,2017-02-20) | 2017-01-04 | 2017-02-20 | t    |            |
 [2017-01-08,2017-02-12) | 2017-01-08 | 2017-02-12 | f    |            | 2017-02-12
 [2017-02-21,2017-03-29) | 2017-02-21 | 2017-03-29 | t    | 2017-02-21 |
 [2017-03-17,2017-04-12) | 2017-03-17 | 2017-04-12 |      |            | 2017-04-12
(5 rows)
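I know I am reinventing the wheel here, but someone should show how it works. In the output above, the ovps column shows whether a row's range overlaps the range of the next row (ordered by s), and strt and fnsh are the start and end of the resulting merged range, again ordered by s (the original start in the data set before collapsing)...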
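To see what ovps measures in isolation, the && operator can be tried on literal ranges. daterange(s, e) builds a half-open range [s, e), so two ranges that merely touch are not reported as overlapping. The dates below are only illustrative and the column aliases are made up for this sketch.

```sql
-- The first two ranges from the question overlap, so this yields true.
select daterange(date '2017-01-01', date '2017-01-31')
    && daterange(date '2017-01-04', date '2017-02-20') as overlapping;

-- The upper bound is exclusive by default, so ranges that only touch yield false.
select daterange(date '2017-01-01', date '2017-01-31')
    && daterange(date '2017-01-31', date '2017-02-20') as touching;
```

Because ovps is computed with lead(), it compares each row with the next one only, and it is NULL on the last row, which has no successor.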
Next:
t=# with c as (select daterange(s,e), s,e, daterange(s,e) && lead(daterange(s,e)) over (order by s) ovps from t)
, n as (select *, case when not lag(ovps) over (order by s) or row_number() over (order by s) = 1 then s end strt, case when not ovps or row_number() over (order by s) = count(1) over () then e end fnsh
from c)
select daterange, strt, fnsh, dense_rank() over (order by strt), dense_rank() over (order by fnsh) from n order by s;
        daterange        |    strt    |    fnsh    | dense_rank | dense_rank
-------------------------+------------+------------+------------+------------
 [2017-01-01,2017-01-31) | 2017-01-01 |            |          1 |          3
 [2017-01-04,2017-02-20) |            |            |          3 |          3
 [2017-01-08,2017-02-12) |            | 2017-02-12 |          3 |          1
 [2017-02-21,2017-03-29) | 2017-02-21 |            |          2 |          3
 [2017-03-17,2017-04-12) |            | 2017-04-12 |          3 |          2
(5 rows)
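With dense_rank we get the order in which each date appears (NULLs sort last, so they share the highest rank). Now that we know the order in which the values appear, we can take the min and max: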
t=# with c as (select daterange(s,e), s,e, daterange(s,e) && lead(daterange(s,e)) over (order by s) ovps from t)
, n as (select *, case when not lag(ovps) over (order by s) or row_number() over (order by s) = 1 then s end strt, case when not ovps or row_number() over (order by s) = count(1) over () then e end fnsh
from c)
, d as (select daterange, strt, fnsh, dense_rank() over (order by strt) ds, dense_rank() over (order by fnsh) de from n order by s)
select distinct min(strt) over (partition by least(ds,de)), max(fnsh) over (partition by least(ds,de)) from d;
    min     |    max
------------+------------
            |
 2017-02-21 | 2017-04-12
 2017-01-01 | 2017-02-12
(3 rows)
Now just remove the empty row and order the result properly:
t=# with c as (select daterange(s,e), s,e, daterange(s,e) && lead(daterange(s,e)) over (order by s) ovps from t)
, n as (select *, case when not lag(ovps) over (order by s) or row_number() over (order by s) = 1 then s end strt, case when not ovps or row_number() over (order by s) = count(1) over () then e end fnsh
from c)
, d as (select daterange, strt, fnsh, dense_rank() over (order by strt) ds, dense_rank() over (order by fnsh) de from n order by s)
, f as (select distinct min(strt) over (partition by least(ds,de)), max(fnsh) over (partition by least(ds,de)) from d)
select min startc, max endc from f where min is not null order by min;
   startc   |    endc
------------+------------
 2017-01-01 | 2017-02-12
 2017-02-21 | 2017-04-12
(2 rows)
Finally, I am pretty sure most of this "logic" could be reduced mathematically to make it neater...
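One observation on the result above: with the extra row, the first merged range ends at 2017-02-12, although the row (2017-01-04, 2017-02-20) falls into the same group and ends later. That happens because fnsh is only taken from a row whose next neighbour does not overlap it. A shorter formulation that avoids this compares each start with the running maximum of all earlier ends. What follows is only a sketch against the same table t; the CTE and column names (ordered, flagged, grouped, prev_end, is_start, grp) are made up for illustration.

```sql
with ordered as (
  -- Latest end date among all rows that sort before the current one.
  select s, e,
         max(e) over (order by s, e
                      rows between unbounded preceding and 1 preceding) as prev_end
  from t
), flagged as (
  -- A row opens a new merged range when nothing before it reaches its start.
  -- >= mirrors the half-open [s, e) comparison; use > to merge touching ranges too.
  select s, e,
         case when prev_end is null or s >= prev_end then 1 else 0 end as is_start
  from ordered
), grouped as (
  -- The running count of range starts labels every row with its merged range.
  select s, e, sum(is_start) over (order by s, e) as grp
  from flagged
)
select min(s) as startc, max(e) as endc
from grouped
group by grp
order by startc;
```

For the five rows used here this should return 2017-01-01 to 2017-02-20 and 2017-02-21 to 2017-04-12, which matches the output asked for in the question.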