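I have a table passenger_log with four columns: psngr_id (not necessarily unique), arrival_dt_tm, departure_dt_tm and timediff (the difference between a passenger's last departure date and next arrival date).
From this table, I want to select only the rows where the time difference is less than 24 hours, together with the row that immediately follows each of them.
I am using SQL Server 2008, so I cannot use the newer LEAD and LAG functions.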
Existing table -
local_passenger_id arrival_date_time departure_date_time timediff
-----------------------------------------------------------------
00F9P0L193 28/07/2013 23:30 28/07/2013 23:36 0.516666
00F9P0L193 29/07/2013 00:07 29/07/2013 03:16 NULL
01MLEMFDUK 12/10/2008 16:43 12/10/2008 20:28 21.116666
01MLEMFDUK 13/10/2008 17:35 13/10/2008 20:19 3889.9
01MLEMFDUK 24/03/2009 22:13 25/03/2009 01:23 32268
01MLEMFDUK 28/11/2012 13:23 28/11/2012 14:10 676.583333
01MLEMFDUK 26/12/2012 18:45 26/12/2012 19:27 20.433333
01MLEMFDUK 27/12/2012 15:53 27/12/2012 17:07 289.2
01MLEMFDUK 08/01/2013 18:19 08/01/2013 19:00 NULL
02CJHH6KAJ 27/02/2011 09:48 27/02/2011 11:56 10167.25
02CJHH6KAJ 26/04/2012 03:11 26/04/2012 06:42 44.566666
02CJHH6KAJ 28/04/2012 03:16 28/04/2012 07:06 23.233333
02CJHH6KAJ 29/04/2012 06:20 29/04/2012 09:45 NULL
Desired output -
local_passenger_id arrival_date_time departure_date_time timediff_mins
----------------------------------------------------------------------
00F9P0L193 28/07/2013 23:30 28/07/2013 23:36 0.516666
00F9P0L193 29/07/2013 00:07 29/07/2013 03:16 NULL
01MLEMFDUK 12/10/2008 16:43 12/10/2008 20:28 21.116666
01MLEMFDUK 13/10/2008 17:35 13/10/2008 20:19 3889.9
01MLEMFDUK 26/12/2012 18:45 26/12/2012 19:27 20.433333
01MLEMFDUK 27/12/2012 15:53 27/12/2012 17:07 289.2
02CJHH6KAJ 28/04/2012 03:16 28/04/2012 07:06 23.233333
02CJHH6KAJ 29/04/2012 06:20 29/04/2012 09:45 NULL
Answer 0 (score: 1)
Try this:
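-- cte: number every row in (pid, adt) order so consecutive rows get consecutive ids
WITH cte(id,pid,adt,ddt,tdif) AS
(SELECT ROW_NUMBER() OVER (ORDER BY pid,adt), pid,adt,ddt,tdif FROM pass),
-- hit: rows with tdif < 24 that are followed by another row of the same passenger
hit(id) AS
(SELECT id FROM cte c1 WHERE EXISTS
(SELECT 1 FROM cte c2 WHERE c2.id=c1.id+1 AND c2.pid=c1.pid AND c1.tdif<24) )
-- return each hit row together with the row immediately after it
SELECT pid,adt,ddt,tdif FROM cte WHERE id IN
(SELECT id FROM hit UNION SELECT id+1 FROM hit)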
Based on this table:
INSERT INTO pass
([pid], [adt], [ddt], [tdif])
VALUES
('00F9P0L193', '28.07.2013 23:30', '28.07.2013 23:36', '0.516666'),
('00F9P0L193', '29.07.2013 00:07', '29.07.2013 03:16', NULL),
('01MLEMFDUK', '12.10.2008 16:43', '12.10.2008 20:28', '21.116666'),
('01MLEMFDUK', '13.10.2008 17:35', '13.10.2008 20:19', '3889.9'),
('01MLEMFDUK', '24.03.2009 22:13', '25.03.2009 01:23', '32268'),
('01MLEMFDUK', '28.11.2012 13:23', '28.11.2012 14:10', '676.583333'),
('01MLEMFDUK', '26.12.2012 18:45', '26.12.2012 19:27', '20.433333'),
('01MLEMFDUK', '27.12.2012 15:53', '27.12.2012 17:07', '289.2'),
('01MLEMFDUK', '08.01.2013 18:19', '08.01.2013 19:00', NULL),
('02CJHH6KAJ', '27.02.2011 09:48', '27.02.2011 11:56', '10167.25'),
('02CJHH6KAJ', '26.04.2012 03:11', '26.04.2012 06:42', '44.566666'),
('02CJHH6KAJ', '28.04.2012 03:16', '28.04.2012 07:06', '23.233333'),
('02CJHH6KAJ', '29.04.2012 06:20', '29.04.2012 09:45', NULL)
;
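The common table expressions:
cte is a numbered list (I use row_number() over all rows, regardless of passenger ID).
hit is a table containing only the ids of rows that have tdif < 24 and are followed by another row for the same passenger.
In the main select I use a UNION SELECT on hit to combine the rows from hit with the rows that immediately follow them.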
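To make the cte / hit logic easier to follow outside the database, here is a minimal Python sketch of the same steps. It is not part of the original answer: the function name select_pairs, the constant ROWS and the ISO date strings are assumptions made for illustration (ISO strings are used so that plain string sorting is chronological), and the ddt column is left out for brevity.

```python
from typing import List, Optional, Tuple

# (pid, adt, tdif). adt is an ISO string here (an assumption of this sketch)
# so that sorting the strings also sorts the rows chronologically.
Row = Tuple[str, str, Optional[float]]

ROWS: List[Row] = [
    ("00F9P0L193", "2013-07-28 23:30", 0.516666),
    ("00F9P0L193", "2013-07-29 00:07", None),
    ("01MLEMFDUK", "2008-10-12 16:43", 21.116666),
    ("01MLEMFDUK", "2008-10-13 17:35", 3889.9),
    ("01MLEMFDUK", "2009-03-24 22:13", 32268.0),
    ("01MLEMFDUK", "2012-11-28 13:23", 676.583333),
    ("01MLEMFDUK", "2012-12-26 18:45", 20.433333),
    ("01MLEMFDUK", "2012-12-27 15:53", 289.2),
    ("01MLEMFDUK", "2013-01-08 18:19", None),
    ("02CJHH6KAJ", "2011-02-27 09:48", 10167.25),
    ("02CJHH6KAJ", "2012-04-26 03:11", 44.566666),
    ("02CJHH6KAJ", "2012-04-28 03:16", 23.233333),
    ("02CJHH6KAJ", "2012-04-29 06:20", None),
]


def select_pairs(rows: List[Row], limit: float = 24) -> List[Row]:
    # "cte": order all rows by (pid, adt); the list index plays the role of id.
    cte = sorted(rows, key=lambda r: (r[0], r[1]))

    # "hit": ids whose tdif is below the limit and whose next row (id + 1)
    # belongs to the same passenger. A NULL (None) tdif never qualifies.
    hit = {
        i
        for i in range(len(cte) - 1)
        if cte[i][2] is not None
        and cte[i][2] < limit
        and cte[i + 1][0] == cte[i][0]
    }

    # Main select: ids in hit UNION ids in hit shifted by one.
    keep = hit | {i + 1 for i in hit}
    return [cte[i] for i in sorted(keep)]


if __name__ == "__main__":
    for pid, adt, tdif in select_pairs(ROWS):
        print(pid, adt, tdif)
```

With the sample rows from the question, the sketch keeps eight rows: each row with tdif < 24 plus the row that follows it for the same passenger.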
Output:
pid, adt, ddt, tdif
00F9P0L193 28.07.2013 23:30 28.07.2013 23:36 0.516666
00F9P0L193 29.07.2013 00:07 29.07.2013 03:16
01MLEMFDUK 12.10.2008 16:43 12.10.2008 20:28 21.116666
01MLEMFDUK 13.10.2008 17:35 13.10.2008 20:19 3889.9
01MLEMFDUK 26.12.2012 18:45 26.12.2012 19:27 20.433333
01MLEMFDUK 27.12.2012 15:53 27.12.2012 17:07 289.2
02CJHH6KAJ 28.04.2012 03:16 28.04.2012 07:06 23.233333
02CJHH6KAJ 29.04.2012 06:20 29.04.2012 09:45
This can also be done with just one CTE:
;WITH cte AS
(SELECT ROW_NUMBER() OVER (ORDER BY pid,adt) id,pid,adt,ddt,tdif FROM pass)
SELECT pid,adt,ddt,tdif FROM cte c0 WHERE EXISTS (
SELECT 1 FROM cte c1 WHERE c1.id IN(c0.id,c0.id-1) AND EXISTS
(SELECT 1 FROM cte c2 WHERE c2.id=c1.id+1 AND c2.pid=c1.pid AND c1.tdif<24) )
It may be less easy to read, but it works the same way...
Answer 1 (score: 0)
SELECT T.local_passenger_id, T.arrival_date_time, T.departure_date_time, T.timediff FROM passenger_log T
WHERE T.timediff < 24
UNION
SELECT T2.local_passenger_id, T2.arrival_date_time, T2.departure_date_time, T2.timediff FROM
(SELECT *, RANK() OVER (PARTITION BY local_passenger_id ORDER BY arrival_date_time) AS Ranked FROM passenger_log) T1
LEFT JOIN
(SELECT *, RANK() OVER (PARTITION BY local_passenger_id ORDER BY arrival_date_time) AS Ranked FROM passenger_log) T2
ON T1.local_passenger_id = T2.local_passenger_id AND T1.Ranked + 1 = T2.Ranked
WHERE T1.timediff < 24
ORDER BY local_passenger_id, arrival_date_time