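We have a lot of people, and these people go on trips that pass through multiple stages/states (initially planned, then started, then either returned safely or ended in disaster).
I have a query that returns the correct results; it is reproduced below together with the schema and sample data.
However, I would like to know whether there is a better implementation, specifically one that avoids GROUP BY and Postgres's bool_and, and possibly the nested query as well.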
Who has never been on a trip that they did not return safely from?
Or, put another way:
Who has:
1. never planned or gone on a trip,
or 2. only ever returned safely?
Clarifications
Output
At the very least, all the columns from the people table; it is fine if other columns come out as well.
CREATE TABLE people (person_name text, gender text, age integer);
INSERT INTO people (person_name, gender, age)
VALUES ('pete', 'm', 10), ('alan', 'm', 22), ('jess', 'f', 24), ('agnes', 'f', 25), ('matt', 'm', 26);
CREATE TABLE trips (person_name text, trip_name text);
INSERT INTO trips (person_name, trip_name)
VALUES ('pete', 'a'),
('pete', 'b'),
('alan', 'c'),
('alan', 'd'),
('jess', 'e'),
('matt', 'f');
CREATE TABLE trip_stages (trip_name text, stage text, most_recent boolean);
INSERT INTO trip_stages
VALUES ('a', 'started', 'f'), ('a', 'disaster', 't'),
('b', 'started', 't'),
('c', 'started', 'f'), ('c', 'safe_return', 't'),
('e', 'started', 'f'), ('e', 'safe_return', 't');
 person_name | gender | age
-------------+--------+-----
 jess        | f      |  24
 agnes       | f      |  25
SELECT people.* FROM people WHERE people.person_name IN (
    SELECT people.person_name FROM people
    LEFT OUTER JOIN trips
        ON trips.person_name = people.person_name
    LEFT OUTER JOIN trip_stages
        ON trip_stages.trip_name = trips.trip_name AND trip_stages.most_recent = 't'
    GROUP BY people.person_name
    HAVING bool_and(trips.trip_name IS NULL)
        OR bool_and(trip_stages.stage IS NOT NULL AND trip_stages.stage = 'safe_return')
);
SELECT people.* FROM people WHERE people.person_name IN (
    -- All the people
    SELECT people.person_name FROM people
    -- + All their trips
    LEFT OUTER JOIN trips
        ON trips.person_name = people.person_name
    -- + Each trip's most recent stage
    LEFT OUTER JOIN trip_stages
        ON trip_stages.trip_name = trips.trip_name AND trip_stages.most_recent = 't'
    -- Group by person
    GROUP BY people.person_name
    -- Keep only those groups where either:
    -- 1. trip_name is always NULL (they've made no trips)
    -- 2. Every trip has ended with a safe return
    HAVING bool_and(trips.trip_name IS NULL)
        OR bool_and(trip_stages.stage IS NOT NULL AND trip_stages.stage = 'safe_return')
);
Can I write this query another way? Without using GROUP BY and bool_and, and ideally without using a subquery? Maybe with some partition/window function?
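One possible direction, offered as a minimal sketch and not a confirmed answer, is a double NOT EXISTS (an anti-join). It drops GROUP BY and bool_and but still relies on correlated subqueries, so it only partly meets the goal. The aliases p, t and s are introduced here for illustration. It also assumes each trip has at most one row with most_recent set; if a trip could have several, this version would behave differently from the bool_and query.

```sql
-- Sketch: keep each person for whom there is NO trip
-- that LACKS a most-recent stage of 'safe_return'.
-- People with no trips at all pass the outer NOT EXISTS trivially.
SELECT p.*
FROM people AS p
WHERE NOT EXISTS (
    SELECT 1
    FROM trips AS t
    WHERE t.person_name = p.person_name
      AND NOT EXISTS (
          SELECT 1
          FROM trip_stages AS s
          WHERE s.trip_name = t.trip_name
            AND s.most_recent            -- only the current stage counts
            AND s.stage = 'safe_return'
      )
);
```

Reasoning through the sample data, this selects jess (her only trip ended in safe_return) and agnes (no trips). It rejects pete, alan and matt, each of whom has at least one trip without a most-recent safe_return stage.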
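I'm using this to learn, so explanations/analysis of the queries are appreciated!
I'm especially interested in the performance implications. For example, what happens if people go on thousands of trips? Do subqueries outperform the other approaches?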
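One way to look into the performance question, sketched under the assumption that the tables above live in a scratch PostgreSQL database, is to bulk-load synthetic trips with generate_series and then read the plan from EXPLAIN (ANALYZE, BUFFERS). The 'bulk_' trip-name prefix and the row count of 5000 are made-up values for illustration only.

```sql
-- Hypothetical test data: give jess 5000 extra trips, all safely returned.
INSERT INTO trips (person_name, trip_name)
SELECT 'jess', 'bulk_' || g
FROM generate_series(1, 5000) AS g;

INSERT INTO trip_stages (trip_name, stage, most_recent)
SELECT 'bulk_' || g, 'safe_return', true
FROM generate_series(1, 5000) AS g;

-- Refresh planner statistics so the estimates reflect the new rows.
ANALYZE trips;
ANALYZE trip_stages;

-- Run the GROUP BY / bool_and query and show the actual plan and timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT people.* FROM people WHERE people.person_name IN (
    SELECT people.person_name FROM people
    LEFT OUTER JOIN trips
        ON trips.person_name = people.person_name
    LEFT OUTER JOIN trip_stages
        ON trip_stages.trip_name = trips.trip_name AND trip_stages.most_recent = 't'
    GROUP BY people.person_name
    HAVING bool_and(trips.trip_name IS NULL)
        OR bool_and(trip_stages.stage IS NOT NULL AND trip_stages.stage = 'safe_return')
);
```

The same EXPLAIN wrapper can be put around any alternative formulation to compare join strategies and timings. Note that none of the tables as defined above has an index, so adding indexes on the join columns would likely change the plans considerably.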