Advanced SQL analytics question using time conversion logic

Time: 2018-11-04 02:41:06

Tags: sql analytics

I have the following table structure:

CREATE TABLE Trains (
    id int primary key,
    origin varchar not null,
    dest varchar not null,
    departure_time varchar not null);

CREATE TABLE Passengers (
    id int primary key,
    origin varchar not null,
    dest varchar not null,
    departure_time varchar not null);

Trains
id   origin  dest  departure_time
10   Beg     knp   10:20
20   Beg     knp   7:40
30   Del     Sin   12:05
40   Ghr     poh   13:40
50   Del     Sin   18:05

Passengers
id   origin  dest  departure_time
101  Beg     knp   10:20
201  Beg     knp   7:00
301  Del     Sin   12:00
401  Ghr     poh   13:45
501  Del     Sin   19:05

I am trying to find the number of passengers travelling on each train.

Assumptions:

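  1. A passenger can board a train even if they arrive at the station exactly at its departure time. That is, a passenger who arrives at 12:05 can still board the train that departs at 12:05.

  2. No two trains have the same origin, destination, and departure time.

  3. A passenger catches the earliest train that departs at or after their own departure time.

  4. There are no intermediate stops between a train's origin and destination.

  5. A passenger can only take a direct train from their origin to their destination.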

Can anyone explain how to approach this problem?

I wrote the query below:

select t.id, count(p.id)
from passengers p, trains t
where t.origin = p.origin and t.dest = p.dest
    and cast(p.departure_time as time) <= cast(t.departure_time as time)
group by t.id

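I don't think I have accounted for the third assumption.

3 Answers:

Answer 0: (score: 4)

There is really no simple way to do this.

The problem with your current query is that it assigns each passenger to every train whose t.departure_time is at or after p.departure_time, whereas we want each passenger to catch only the first train at or after p.departure_time.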
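The overcounting can be reproduced on the sample data. The sketch below is only an illustration, assuming SQLite through Python's `sqlite3` module rather than whatever database the question targets. Because SQLite has no dedicated TIME type, the times are stored as minutes since midnight in a hypothetical `dep_min` column, filled by a hypothetical `to_minutes` helper.

```python
import sqlite3

def to_minutes(hhmm: str) -> int:
    """Convert 'H:MM' or 'HH:MM' to minutes since midnight (illustrative helper)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

# Sample rows copied from the question.
TRAINS = [(10, "Beg", "knp", "10:20"), (20, "Beg", "knp", "7:40"),
          (30, "Del", "Sin", "12:05"), (40, "Ghr", "poh", "13:40"),
          (50, "Del", "Sin", "18:05")]
PASSENGERS = [(101, "Beg", "knp", "10:20"), (201, "Beg", "knp", "7:00"),
              (301, "Del", "Sin", "12:00"), (401, "Ghr", "poh", "13:45"),
              (501, "Del", "Sin", "19:05")]

conn = sqlite3.connect(":memory:")
for table, rows in (("trains", TRAINS), ("passengers", PASSENGERS)):
    # dep_min is an assumed column name: departure time in minutes since midnight.
    conn.execute(f"CREATE TABLE {table} "
                 "(id INTEGER PRIMARY KEY, origin TEXT, dest TEXT, dep_min INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
                     [(i, o, d, to_minutes(t)) for i, o, d, t in rows])

# Same join condition as the question's query: every later train on the route matches.
naive = conn.execute("""
    SELECT t.id, COUNT(p.id)
    FROM passengers p
    JOIN trains t
      ON t.origin = p.origin AND t.dest = p.dest AND p.dep_min <= t.dep_min
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(naive)  # [(10, 2), (20, 1), (30, 1), (50, 1)]
```

Passenger 201 is counted on both train 20 and train 10, and passenger 301 on both train 30 and train 50, so five boardings are reported for only three travelling passengers. Train 40 is missing entirely because the inner join drops trains with no matching passenger.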

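First work out all the trains each passenger could catch, then narrow that set down.

A query that selects all the trains a passenger could board: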

SELECT *
FROM passengers p
LEFT JOIN trains t ON t.origin = p.origin AND t.dest = p.dest
    AND CAST(t.departure_time AS TIME) >= CAST(p.departure_time AS TIME)

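We take the query above, group it by the passenger's id, and select the minimum departure time (that is, the first train that departs at or after the passenger's departure time). We'll call this query will_catch.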

SELECT p.id AS pid, MIN(CAST(t.departure_time AS TIME)) AS t_deptime
FROM passengers p
LEFT JOIN trains t ON t.origin = p.origin AND t.dest = p.dest
    AND CAST(t.departure_time AS TIME) >= CAST(p.departure_time AS TIME)
GROUP BY p.id

The final query looks like this:

; WITH will_catch AS (
    SELECT p.id  pid, MIN(CAST(t.departure_time AS TIME)) AS t_deptime
    FROM passengers p
    LEFT JOIN trains t ON t.origin = p.origin AND t.dest = p.dest
        AND CAST(t.departure_time AS TIME) >= CAST(p.departure_time AS TIME)
    GROUP BY p.id
)
SELECT t.id, COUNT(t_deptime)
FROM will_catch wc
LEFT JOIN passengers p ON wc.pid = p.id
LEFT JOIN trains t ON (p.origin = t.origin AND p.dest = t.dest) 
    AND (wc.t_deptime = CAST(t.departure_time AS TIME) OR t_deptime IS NULL)
GROUP BY t.id
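As a rough check of the two-step idea, the sketch below runs a variant of it on the sample data. It is not the exact query above. It assumes SQLite through Python's `sqlite3`, stores times as minutes since midnight in a hypothetical `dep_min` column (SQLite has no dedicated TIME type), and starts the outer query from `trains`, so a train nobody boards still shows up with a count of 0. The `to_minutes` helper is likewise only illustrative.

```python
import sqlite3

def to_minutes(hhmm: str) -> int:
    """Convert 'H:MM' or 'HH:MM' to minutes since midnight (illustrative helper)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

# Sample rows copied from the question.
TRAINS = [(10, "Beg", "knp", "10:20"), (20, "Beg", "knp", "7:40"),
          (30, "Del", "Sin", "12:05"), (40, "Ghr", "poh", "13:40"),
          (50, "Del", "Sin", "18:05")]
PASSENGERS = [(101, "Beg", "knp", "10:20"), (201, "Beg", "knp", "7:00"),
              (301, "Del", "Sin", "12:00"), (401, "Ghr", "poh", "13:45"),
              (501, "Del", "Sin", "19:05")]

conn = sqlite3.connect(":memory:")
for table, rows in (("trains", TRAINS), ("passengers", PASSENGERS)):
    # dep_min is an assumed column name: departure time in minutes since midnight.
    conn.execute(f"CREATE TABLE {table} "
                 "(id INTEGER PRIMARY KEY, origin TEXT, dest TEXT, dep_min INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
                     [(i, o, d, to_minutes(t)) for i, o, d, t in rows])

counts = conn.execute("""
    WITH will_catch AS (
        -- Step 1: earliest catchable departure for each passenger.
        SELECT p.id AS pid, p.origin, p.dest, MIN(t.dep_min) AS t_dep
        FROM passengers p
        JOIN trains t
          ON t.origin = p.origin AND t.dest = p.dest AND t.dep_min >= p.dep_min
        GROUP BY p.id, p.origin, p.dest
    )
    -- Step 2: attach each passenger to the single train with that departure.
    SELECT t.id, COUNT(wc.pid)
    FROM trains t
    LEFT JOIN will_catch wc
      ON wc.origin = t.origin AND wc.dest = t.dest AND wc.t_dep = t.dep_min
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(counts)  # [(10, 1), (20, 1), (30, 1), (40, 0), (50, 0)]
```

Matching on origin, destination, and departure time identifies exactly one train because of assumption 2. Passengers 401 and 501 arrive after the last train on their route, so they are counted nowhere.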

Answer 1: (score: 0)

You can use a correlated subquery:

select p.*,
       (select t.id
        from trains t
        where t.origin = p.origin and t.dest = p.dest and
              cast(t.departure_time as time) >= cast(p.departure_time as time)
        -- ascending order, so the earliest catchable train comes first
        order by cast(t.departure_time as time) asc
        fetch first 1 row only
       ) as train_id
from passengers p;

Note that this uses ANSI/ISO syntax to fetch a single row from the subquery. The syntax may differ in a particular database (top 1, limit 1, and so on).
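For instance, SQLite spells the row limit as `LIMIT 1`. The sketch below is a minimal illustration of the correlated-subquery approach under that assumption, run through Python's `sqlite3`. Times are stored as minutes since midnight in a hypothetical `dep_min` column because SQLite has no dedicated TIME type, and the `to_minutes` helper is illustrative. The subquery sorts in ascending order so that the earliest catchable train is the one returned.

```python
import sqlite3

def to_minutes(hhmm: str) -> int:
    """Convert 'H:MM' or 'HH:MM' to minutes since midnight (illustrative helper)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

# Sample rows copied from the question.
TRAINS = [(10, "Beg", "knp", "10:20"), (20, "Beg", "knp", "7:40"),
          (30, "Del", "Sin", "12:05"), (40, "Ghr", "poh", "13:40"),
          (50, "Del", "Sin", "18:05")]
PASSENGERS = [(101, "Beg", "knp", "10:20"), (201, "Beg", "knp", "7:00"),
              (301, "Del", "Sin", "12:00"), (401, "Ghr", "poh", "13:45"),
              (501, "Del", "Sin", "19:05")]

conn = sqlite3.connect(":memory:")
for table, rows in (("trains", TRAINS), ("passengers", PASSENGERS)):
    # dep_min is an assumed column name: departure time in minutes since midnight.
    conn.execute(f"CREATE TABLE {table} "
                 "(id INTEGER PRIMARY KEY, origin TEXT, dest TEXT, dep_min INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
                     [(i, o, d, to_minutes(t)) for i, o, d, t in rows])

# For each passenger, look up the earliest train on the same route
# that departs at or after the passenger's own time.
assignments = conn.execute("""
    SELECT p.id,
           (SELECT t.id
            FROM trains t
            WHERE t.origin = p.origin AND t.dest = p.dest
              AND t.dep_min >= p.dep_min
            ORDER BY t.dep_min ASC
            LIMIT 1) AS train_id
    FROM passengers p
    ORDER BY p.id
""").fetchall()

print(assignments)
# [(101, 10), (201, 20), (301, 30), (401, None), (501, None)]
```

This yields one row per passenger rather than one per train. Getting the per-train totals would still need a further step, such as grouping these rows by `train_id` and joining back to `trains`.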

Answer 2: (score: 0)

select t.id, count(sub2.pass_id) as no_of_passenger
from trains t
left outer join (
    -- earliest catchable departure for each passenger
    select min(sub.t2) as dep_time, sub.id1 as pass_id, sub.org, sub.des
    from (
        -- every train a passenger could catch; times are cast so they
        -- compare as times rather than as strings
        select p.id as id1, t.origin as org, t.dest as des,
               cast(t.departure_time as time) as t2
        from trains t
        join passengers p
          on t.origin = p.origin and t.dest = p.dest
         and cast(t.departure_time as time) >= cast(p.departure_time as time)
    ) sub
    group by sub.id1, sub.org, sub.des
) sub2
  on t.origin = sub2.org and t.dest = sub2.des
 and cast(t.departure_time as time) = sub2.dep_time
group by t.id
order by t.id;