I am using PostgreSQL 8.3.8.
I have a list of time boundaries (per date) in the role_times_boundaries table:
CREATE TABLE role_times_boundaries
(
role_date DATE,
time_boundary TIME
);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-24'::date, '09:00:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-24'::date, '10:00:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-25'::date, '07:00:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-25'::date, '08:50:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-25'::date, '09:00:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-25'::date, '12:00:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-25'::date, '13:00:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-25'::date, '16:00:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-25'::date, '17:30:00'::time);
INSERT INTO role_times_boundaries (role_date, time_boundary) VALUES ('2013-04-25'::date, '20:00:00'::time);
So the table contains:
role_date | time_boundary
------------+---------------
2013-04-24 | 09:00:00
2013-04-24 | 10:00:00
2013-04-25 | 07:00:00
2013-04-25 | 08:50:00
2013-04-25 | 09:00:00
2013-04-25 | 12:00:00
2013-04-25 | 13:00:00
2013-04-25 | 16:00:00
2013-04-25 | 17:30:00
2013-04-25 | 20:00:00
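I want to build a "time slice list" with an inner self-join on role_times_boundaries, taking each time_boundary as start_time and the next time_boundary (in order) on the same date as end_time. The goal is to get this result: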
role_date | start_time | end_time
------------+------------+----------
2013-04-24 | 09:00:00 | 10:00:00
2013-04-25 | 07:00:00 | 08:50:00
2013-04-25 | 08:50:00 | 09:00:00
2013-04-25 | 09:00:00 | 12:00:00
2013-04-25 | 12:00:00 | 13:00:00
2013-04-25 | 13:00:00 | 16:00:00
2013-04-25 | 16:00:00 | 17:30:00
2013-04-25 | 17:30:00 | 20:00:00
I tried to get the desired result with this SQL query:
SELECT role_times_boundaries.role_date,
role_times_boundaries.time_boundary AS start_time,
end_time_boundaries.time_boundary AS end_time
FROM role_times_boundaries
INNER JOIN (
SELECT role_date,
time_boundary
FROM role_times_boundaries
) AS end_time_boundaries ON (
role_times_boundaries.role_date = end_time_boundaries.role_date
AND end_time_boundaries.time_boundary = (
SELECT MIN(a_list_of_end_boundaries.time_boundary)
FROM role_times_boundaries AS a_list_of_end_boundaries
WHERE a_list_of_end_boundaries.time_boundary > role_times_boundaries.time_boundary
)
)
The result is:
role_date | start_time | end_time
------------+------------+----------
2013-04-24 | 09:00:00 | 10:00:00
2013-04-25 | 07:00:00 | 08:50:00
2013-04-25 | 08:50:00 | 09:00:00
2013-04-25 | 12:00:00 | 13:00:00
2013-04-25 | 13:00:00 | 16:00:00
2013-04-25 | 16:00:00 | 17:30:00
2013-04-25 | 17:30:00 | 20:00:00
As you can see, the 09:00:00 to 12:00:00 time slice is missing! I still don't understand why, and I haven't found my mistake.
Answer 0 (score: 3)
If you upgrade to PostgreSQL 8.4 or later, you can use window functions ("analytic functions" in Oracle terminology) such as rank(), row_number(), lead() and lag():
SELECT tb.role_date AS role_date
, tb.time_boundary AS start_time
, LEAD (time_boundary) OVER www AS end_time
FROM role_times_boundaries tb
WINDOW www AS (PARTITION BY tb.role_date ORDER BY tb.time_boundary)
;
Or, equivalently, the same query with the window definition written inline:
SELECT tb.role_date AS role_date
, tb.time_boundary AS start_time
, LEAD (time_boundary) OVER ( PARTITION BY tb.role_date ORDER BY tb.time_boundary) AS end_time
FROM role_times_boundaries tb;
This gives you the following result set:
role_date | start_time | end_time
------------+------------+----------
2013-04-24 | 09:00:00 | 10:00:00
2013-04-24 | 10:00:00 |
2013-04-25 | 07:00:00 | 08:50:00
2013-04-25 | 08:50:00 | 09:00:00
2013-04-25 | 09:00:00 | 12:00:00
2013-04-25 | 12:00:00 | 13:00:00
2013-04-25 | 13:00:00 | 16:00:00
2013-04-25 | 16:00:00 | 17:30:00
2013-04-25 | 17:30:00 | 20:00:00
2013-04-25 | 20:00:00 |
(10 rows)
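To remove the periods that have no end_time, wrap the query in a subquery. LEAD returns NULL for the last boundary of each date, and a comparison against NULL is never true, so those rows are filtered out: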
SELECT role_date , start_time , end_time
FROM (
SELECT tb.role_date AS role_date
, tb.time_boundary AS start_time
, LEAD (time_boundary) OVER ( PARTITION BY tb.role_date ORDER BY tb.time_boundary) AS end_time
FROM role_times_boundaries tb
) sq
WHERE sq.start_time <= sq.end_time;
This then gives you the following result:
role_date | start_time | end_time
------------+------------+----------
2013-04-24 | 09:00:00 | 10:00:00
2013-04-25 | 07:00:00 | 08:50:00
2013-04-25 | 08:50:00 | 09:00:00
2013-04-25 | 09:00:00 | 12:00:00
2013-04-25 | 12:00:00 | 13:00:00
2013-04-25 | 13:00:00 | 16:00:00
2013-04-25 | 16:00:00 | 17:30:00
2013-04-25 | 17:30:00 | 20:00:00
(8 rows)
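For readers who cannot run a window-function query yet, here is a minimal Python sketch of what LEAD(time_boundary) OVER (PARTITION BY role_date ORDER BY time_boundary) computes on the sample data. It is only an illustration of the semantics, not PostgreSQL code; the names BOUNDARIES and lead_per_date are made up for this example, and dates and times are kept as ISO strings so that plain sorting gives the right order.

```python
from itertools import groupby

# Sample boundaries from the question, as (role_date, time_boundary) strings.
BOUNDARIES = [
    ("2013-04-24", "09:00:00"), ("2013-04-24", "10:00:00"),
    ("2013-04-25", "07:00:00"), ("2013-04-25", "08:50:00"),
    ("2013-04-25", "09:00:00"), ("2013-04-25", "12:00:00"),
    ("2013-04-25", "13:00:00"), ("2013-04-25", "16:00:00"),
    ("2013-04-25", "17:30:00"), ("2013-04-25", "20:00:00"),
]


def lead_per_date(rows):
    """Pair each boundary with the next one on the same date (None if it is the last)."""
    out = []
    # sorted() + groupby() plays the role of PARTITION BY role_date ORDER BY time_boundary.
    for role_date, group in groupby(sorted(rows), key=lambda r: r[0]):
        times = [t for _, t in group]
        # LEAD(...): shift by one row inside the partition, padding the end with None (SQL NULL).
        for start, end in zip(times, times[1:] + [None]):
            out.append((role_date, start, end))
    return out


with_nulls = lead_per_date(BOUNDARIES)                 # 10 rows, like the first result set
slices = [r for r in with_nulls if r[2] is not None]   # 8 rows, like the filtered result set

for row in slices:
    print(*row)
```

The last boundary of each date gets None as its end, which mirrors the two rows with an empty end_time above.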
Update: another alternative, which avoids window functions and solves the problem with NOT EXISTS instead:
SELECT lo.role_date
, lo.time_boundary AS start_time
, hi.time_boundary AS end_time
FROM role_times_boundaries lo
JOIN role_times_boundaries hi
ON lo.role_date = hi.role_date
AND lo.time_boundary < hi.time_boundary
AND NOT EXISTS ( -- eliminate the men in the middle ...
SELECT * FROM role_times_boundaries nx
WHERE nx.role_date = hi.role_date
AND nx.time_boundary > lo.time_boundary
AND nx.time_boundary < hi.time_boundary
);
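As a quick sanity check, the NOT EXISTS query can be run against the sample data. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained); dates and times are stored as ISO text, which compares in the same order, and an ORDER BY is added for a stable result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE role_times_boundaries (role_date TEXT, time_boundary TEXT)")
conn.executemany(
    "INSERT INTO role_times_boundaries VALUES (?, ?)",
    [("2013-04-24", t) for t in ("09:00:00", "10:00:00")]
    + [("2013-04-25", t) for t in ("07:00:00", "08:50:00", "09:00:00", "12:00:00",
                                   "13:00:00", "16:00:00", "17:30:00", "20:00:00")],
)

# Same anti-join as above: keep (lo, hi) only if no boundary lies strictly between them.
NOT_EXISTS_QUERY = """
SELECT lo.role_date, lo.time_boundary AS start_time, hi.time_boundary AS end_time
FROM role_times_boundaries lo
JOIN role_times_boundaries hi
  ON lo.role_date = hi.role_date
 AND lo.time_boundary < hi.time_boundary
 AND NOT EXISTS (
     SELECT 1 FROM role_times_boundaries nx
     WHERE nx.role_date = hi.role_date
       AND nx.time_boundary > lo.time_boundary
       AND nx.time_boundary < hi.time_boundary)
ORDER BY lo.role_date, lo.time_boundary
"""

time_slices = conn.execute(NOT_EXISTS_QUERY).fetchall()
for row in time_slices:
    print(*row)
```

Because the join requires lo.time_boundary < hi.time_boundary, the last boundary of each date never appears as a start_time, so no extra NULL filter is needed.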
Answer 1 (score: 2)
OK, first let's simplify your query a bit:
SELECT
l.role_date,
l.time_boundary AS start_time,
r.time_boundary AS end_time
FROM role_times_boundaries l
INNER JOIN role_times_boundaries AS r ON ( -- You don't need that inner query, it's redundant
l.role_date = r.role_date
AND r.time_boundary = (
SELECT MIN(r2.time_boundary)
FROM role_times_boundaries AS r2
WHERE r2.time_boundary > l.time_boundary))
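Now the problem is that the subquery looks at every time_boundary in r2, not only those on the same role_date. For 2013-04-25 09:00:00, the smallest later boundary across all dates is 10:00:00 (from 2013-04-24), and no row with that time exists on 2013-04-25, so the join finds no match and the slice disappears. The corrected query would therefore be: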
SELECT
l.role_date,
l.time_boundary AS start_time,
r.time_boundary AS end_time
FROM role_times_boundaries l
INNER JOIN role_times_boundaries AS r ON (
l.role_date = r.role_date
AND r.time_boundary = (
SELECT MIN(r2.time_boundary)
FROM role_times_boundaries AS r2
-- Note the added restriction:
WHERE r2.time_boundary > l.time_boundary and r2.role_date = l.role_date))
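To see the effect of the added restriction in isolation, the scalar subquery can be evaluated by hand for the 2013-04-25 09:00:00 boundary. This is a small sketch using Python's built-in sqlite3 module in place of PostgreSQL (an assumption made purely for a self-contained example); the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE role_times_boundaries (role_date TEXT, time_boundary TEXT)")
conn.executemany(
    "INSERT INTO role_times_boundaries VALUES (?, ?)",
    [("2013-04-24", t) for t in ("09:00:00", "10:00:00")]
    + [("2013-04-25", t) for t in ("07:00:00", "08:50:00", "09:00:00", "12:00:00",
                                   "13:00:00", "16:00:00", "17:30:00", "20:00:00")],
)

# Next boundary after 09:00:00 WITHOUT the date restriction (as in the original query).
unrestricted = conn.execute(
    "SELECT MIN(time_boundary) FROM role_times_boundaries WHERE time_boundary > ?",
    ("09:00:00",),
).fetchone()[0]

# Same lookup WITH the date restriction (as in the corrected query).
restricted = conn.execute(
    "SELECT MIN(time_boundary) FROM role_times_boundaries "
    "WHERE time_boundary > ? AND role_date = ?",
    ("09:00:00", "2013-04-25"),
).fetchone()[0]

# The join needs a row on 2013-04-25 whose time equals the unrestricted value.
matches = conn.execute(
    "SELECT COUNT(*) FROM role_times_boundaries WHERE role_date = ? AND time_boundary = ?",
    ("2013-04-25", unrestricted),
).fetchone()[0]

print(unrestricted, restricted, matches)  # 10:00:00 12:00:00 0
```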
The following also works for your use case and may be more readable:
select
l.role_date as role_date,
l.time_boundary as start_time,
min(r.time_boundary) as end_time
from role_times_boundaries l
join role_times_boundaries r on
r.role_date = l.role_date
and r.time_boundary > l.time_boundary
group by l.role_date, l.time_boundary
order by l.role_date, l.time_boundary
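A quick way to convince yourself that the GROUP BY version and the corrected subquery version agree on the sample data is to run both and compare. The sketch below does that with Python's built-in sqlite3 module standing in for PostgreSQL (an assumption for the sake of a self-contained example); an ORDER BY is added to the subquery version so the two result lists are directly comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE role_times_boundaries (role_date TEXT, time_boundary TEXT)")
conn.executemany(
    "INSERT INTO role_times_boundaries VALUES (?, ?)",
    [("2013-04-24", t) for t in ("09:00:00", "10:00:00")]
    + [("2013-04-25", t) for t in ("07:00:00", "08:50:00", "09:00:00", "12:00:00",
                                   "13:00:00", "16:00:00", "17:30:00", "20:00:00")],
)

# Join every boundary to all later boundaries on the same date, keep the earliest.
GROUP_BY_QUERY = """
SELECT l.role_date, l.time_boundary AS start_time, MIN(r.time_boundary) AS end_time
FROM role_times_boundaries l
JOIN role_times_boundaries r
  ON r.role_date = l.role_date AND r.time_boundary > l.time_boundary
GROUP BY l.role_date, l.time_boundary
ORDER BY l.role_date, l.time_boundary
"""

# Corrected correlated-subquery version, with the role_date restriction.
SUBQUERY_QUERY = """
SELECT l.role_date, l.time_boundary AS start_time, r.time_boundary AS end_time
FROM role_times_boundaries l
JOIN role_times_boundaries r
  ON l.role_date = r.role_date
 AND r.time_boundary = (
     SELECT MIN(r2.time_boundary)
     FROM role_times_boundaries r2
     WHERE r2.time_boundary > l.time_boundary AND r2.role_date = l.role_date)
ORDER BY l.role_date, l.time_boundary
"""

group_by_rows = conn.execute(GROUP_BY_QUERY).fetchall()
subquery_rows = conn.execute(SUBQUERY_QUERY).fetchall()
print(group_by_rows == subquery_rows, len(group_by_rows))
```

Both versions use an inner join, so the last boundary of each date has no partner and drops out without any explicit filter.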