I have a table trips in PostgreSQL 10.5:
id  start_date  end_date
----------------------------
1   02/01/2019  02/03/2019
2   02/02/2019  02/03/2019
3   02/06/2019  02/07/2019
4   02/06/2019  02/14/2019
5   02/06/2019  02/06/2019
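I want to calculate the number of trip days that overlap with each given week. The trips in the table have inclusive bounds: both the start date and the end date count as days of the trip. Weeks start on Monday and end on Sunday. The expected result would be: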
week_of   days_utilized
------------------------
01/28/19  5
02/04/19  8
02/11/19  4
Calendar for reference:
Monday 01/28/19 - Sunday 02/03/19
Monday 02/04/19 - Sunday 02/10/19
Monday 02/11/19 - Sunday 02/17/19
I know how to code this in the programming language I use, but I would prefer to do it in Postgres, and I am not sure where to start...
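For anyone who wants to reproduce the problem, here is a minimal setup sketch. The column types are not stated above, so date is an assumption, and the sample rows are rewritten as ISO dates (YYYY-MM-DD) to avoid any dependence on the DateStyle setting.

```sql
-- Assumed schema: the column types are a guess, not given in the question.
CREATE TABLE trips (
    id         int PRIMARY KEY,
    start_date date NOT NULL,
    end_date   date NOT NULL   -- inclusive: the end date is a day of the trip
);

-- The five sample trips from the table above.
INSERT INTO trips (id, start_date, end_date) VALUES
    (1, '2019-02-01', '2019-02-03'),
    (2, '2019-02-02', '2019-02-03'),
    (3, '2019-02-06', '2019-02-07'),
    (4, '2019-02-06', '2019-02-14'),
    (5, '2019-02-06', '2019-02-06');
```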
Answer 0 (score: 3)
You seem to want generate_series() together with a join and group by. To count the trips that overlap each week:
select gs.wk, count(t.id) as num_trips
from generate_series('2019-01-28'::date, '2019-02-11'::date, interval '1 week') gs(wk)
     left join trips t
       on gs.wk <= t.end_date and
          gs.wk + interval '6 day' >= t.start_date
group by gs.wk
order by gs.wk;
EDIT:
I see you want the days covered. That takes a bit more work in the aggregation:
select gs.wk, count(t.id) as num_trips,
       sum( 1 +
            extract(day from (least(gs.wk + interval '6 day', t.end_date) - greatest(gs.wk, t.start_date)))
          ) as days_utilized
from generate_series('2019-01-28'::date, '2019-02-11'::date, interval '1 week') gs(wk)
     left join trips t
       on gs.wk <= t.end_date and
          gs.wk + interval '6 day' >= t.start_date
group by gs.wk
order by gs.wk;
Note: this does not return exactly the results you have. I think the results of this query are the correct ones.
Answer 1 (score: 0)
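I would consider range types for this. Range operators make the computation simpler and clearer: below I use the overlap operator && and the intersection operator *. If the table is big, a functional GiST or SP-GiST index can make the query fast. For example: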
CREATE INDEX trip_range_idx ON trips
USING gist (daterange(start_date, end_date, '[]'));
Your query can then use this index:
SELECT week
     , count(overlap)                       AS ct_trips
     , sum(upper(overlap) - lower(overlap)) AS days_utilized
FROM  (
   SELECT week, trip * week AS overlap   -- intersection of trip and week
   FROM  (
      -- one half-open range [Monday, next Monday) per week
      SELECT daterange(mon::date, mon::date + 7) AS week
      FROM   generate_series(timestamp '2019-01-28'
                           , timestamp '2019-02-11'
                           , interval '1 week') mon
      ) w
   LEFT   JOIN (SELECT daterange(start_date, end_date, '[]') FROM trips) t(trip) ON trip && week
   ) sub
GROUP  BY 1
ORDER  BY 1;
db<>fiddle here
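Note that by default a daterange has an inclusive lower bound and an exclusive upper bound. Your ranges include both the lower and the upper bound, so create the daterange with daterange(start_date, end_date, '[]'). The function upper() still returns the exclusive upper bound, so the expression upper(overlap) - lower(overlap) computes the number of days correctly.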
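A small check of that claim, using trip 1 from the question as a literal so that no table is needed. PostgreSQL stores a daterange in canonical form with an exclusive upper bound, which is why the plain subtraction already counts the last day:

```sql
-- Trip 1 (02/01/2019 to 02/03/2019), built with inclusive bounds '[]'.
SELECT r                   AS canonical_range,  -- [2019-02-01,2019-02-04)
       upper(r)            AS exclusive_upper,  -- 2019-02-04
       upper(r) - lower(r) AS days              -- 3
FROM  (SELECT daterange(date '2019-02-01', date '2019-02-03', '[]') AS r) s;
```

The three days are 02/01, 02/02 and 02/03, so no extra + 1 is needed.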
I use generate_series() with timestamp input for a reason.
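One reason this can matter, offered as a sketch rather than as the exact rationale meant above: generate_series() has no variant that takes date arguments, so date input is resolved to the timestamp with time zone variant, and the generated values then depend on the session's TimeZone setting. pg_typeof() makes the difference visible:

```sql
-- date input is coerced to the timestamptz variant of generate_series()
SELECT pg_typeof(d) AS from_date_input
FROM   generate_series(date '2019-01-28', date '2019-02-11', interval '1 week') d
LIMIT  1;
-- timestamp with time zone

-- timestamp input stays independent of the time zone setting
SELECT pg_typeof(ts) AS from_timestamp_input
FROM   generate_series(timestamp '2019-01-28', timestamp '2019-02-11', interval '1 week') ts
LIMIT  1;
-- timestamp without time zone
```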
Alternatively, if you don't want to use range types, consider the OVERLAPS operator.
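A minimal sketch of what such a query could look like. It assumes the trips table from the question with date columns, and it is not necessarily the query the answer originally had in mind. OVERLAPS treats each period as half-open (start <= time < end), so the inclusive end_date is shifted by one day and each week runs from Monday up to, but not including, the next Monday.

```sql
-- Sketch only: assumes trips(id, start_date date, end_date date).
SELECT w.mon                                  AS week_of
     , count(t.id)                            AS ct_trips
     , sum(least(t.end_date + 1, w.mon + 7)          -- exclusive end of the overlap
         - greatest(t.start_date, w.mon))     AS days_utilized
FROM  (
   SELECT g.ts::date AS mon                   -- Monday of each week
   FROM   generate_series(timestamp '2019-01-28'
                        , timestamp '2019-02-11'
                        , interval '1 week') AS g(ts)
   ) w
LEFT   JOIN trips t
       ON (t.start_date, t.end_date + 1) OVERLAPS (w.mon, w.mon + 7)
GROUP  BY w.mon
ORDER  BY w.mon;
```

With the five sample trips this yields 5, 8 and 4 days for the weeks of 01/28, 02/04 and 02/11. Subtracting one date from another gives an integer number of days, so no extract() is needed. A week with no matching trips would show NULL for days_utilized; wrap the sum in coalesce(..., 0) if a zero is preferred.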