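I'm currently trying to figure out how to filter with a left join that involves nulls. Here is a simplified version of the schema I'm working with: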
CREATE TABLE bookclubs (
bookclub_id UUID NOT NULL PRIMARY KEY
);
CREATE TABLE books (
bookclub_id UUID NOT NULL,
book_id UUID NOT NULL
);
ALTER TABLE books ADD CONSTRAINT books_pk PRIMARY KEY(bookclub_id, book_id);
ALTER TABLE books ADD CONSTRAINT book_to_bookclub FOREIGN KEY(bookclub_id)
REFERENCES bookclubs(bookclub_id) ON UPDATE NO ACTION ON DELETE CASCADE;
CREATE INDEX books_bookclub_index ON books (bookclub_id);
CREATE TABLE book_reviews (
bookclub_id UUID NOT NULL,
book_id UUID NOT NULL,
reviewer_id TEXT NOT NULL,
rating int8 NOT NULL
);
ALTER TABLE book_reviews ADD CONSTRAINT book_reviews_pk PRIMARY KEY(bookclub_id, book_id, reviewer_id);
ALTER TABLE book_reviews ADD CONSTRAINT book_review_to_book FOREIGN KEY(bookclub_id,book_id)
REFERENCES books(bookclub_id,book_id) ON UPDATE NO ACTION ON DELETE CASCADE;
CREATE INDEX book_review_to_book_index ON book_reviews ( bookclub_id, book_id);
CREATE INDEX book_review_by_reviewer ON book_reviews ( bookclub_id, reviewer_id, rating);
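I want a query that, for a given bookclub_id and reviewer_id, returns all books that the reviewer has rated >= 3 or has not rated at all. Books they have not rated have no entry in the book_reviews table, and that is something I cannot change. rating is actually an enum, in case that is relevant, but I don't think it is.
My first attempt at this failed: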
SELECT *
FROM books
LEFT OUTER JOIN book_reviews
ON ( ( ( books.bookclub_id = book_reviews.bookclub_id )
AND ( books.book_id = book_reviews.book_id ) )
AND ( book_reviews.reviewer_id = 'alice' ) )
WHERE books.bookclub_id = '00000000-0000-0000-0000-000000000000'
AND book_reviews.rating != 1
AND book_reviews.rating != 2;
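This drops the books that have no review from the user, which makes sense once I think about how the WHERE condition is actually applied. Here is the query plan: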
Nested Loop (cost=0.30..16.39 rows=1 width=104)
-> Index Scan using book_reviews_pk on book_reviews (cost=0.15..8.21 rows=1 width=72)
Index Cond: ((bookclub_id = '00000000-0000-0000-0000-000000000000'::uuid) AND (reviewer_id = 'alice'::text))
Filter: ((rating <> 1) AND (rating <> 2))
-> Index Only Scan using books_pk on books (cost=0.15..8.17 rows=1 width=32)
Index Cond: ((bookclub_id = '00000000-0000-0000-0000-000000000000'::uuid) AND (book_id = book_reviews.book_id))
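To illustrate why the unreviewed books vanish: for a book without a review, the left join fills book_reviews.rating with NULL, and a comparison against NULL yields NULL rather than true, so the WHERE clause rejects the row. A minimal check, assuming PostgreSQL and the int8 rating column from the simplified schema:

```sql
-- Both comparisons evaluate to NULL (unknown), not true,
-- so a WHERE clause built from them filters the row out.
SELECT NULL::int8 != 1 AS not_one,
       NULL::int8 != 2 AS not_two;
```

This is presumably also why the plan above shows a plain Nested Loop rather than a left join: since the WHERE clause can never accept a null-extended row, the join can be treated as an inner join.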
So I added an explicit check for null:
SELECT *
FROM books
LEFT OUTER JOIN book_reviews
ON ( ( ( books.bookclub_id = book_reviews.bookclub_id )
AND ( books.book_id = book_reviews.book_id ) )
AND ( book_reviews.reviewer_id = 'alice' ) )
WHERE books.bookclub_id = '00000000-0000-0000-0000-000000000000'
AND book_reviews.rating IS NULL
OR ( book_reviews.rating != 1
AND book_reviews.rating != 2);
This returns the correct results, but it looks very inefficient and brings the database to a crawl. Here is the query plan:
Hash Left Join (cost=18.75..52.56 rows=1346 width=104)
Hash Cond: ((books.bookclub_id = book_reviews.bookclub_id) AND (books.book_id = book_reviews.book_id))
Filter: (((books.bookclub_id = '00000000-0000-0000-0000-000000000000'::uuid) AND (book_reviews.rating IS NULL)) OR ((book_reviews.rating <> 1) AND (book_reviews.rating <> 2)))
-> Seq Scan on books (cost=0.00..23.60 rows=1360 width=32)
-> Hash (cost=18.69..18.69 rows=4 width=72)
-> Bitmap Heap Scan on book_reviews (cost=10.23..18.69 rows=4 width=72)
Recheck Cond: (reviewer_id = 'alice'::text)
-> Bitmap Index Scan on book_review_by_reviewer (cost=0.00..10.22 rows=4 width=0)
Index Cond: (reviewer_id = 'alice'::text)
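I'm no expert at interpreting these, but the Filter moving to the outside seems bad. Is there an efficient way to structure the query so that I get the results I want? Thanks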
Answer 0 (score: 0)
Move the filter into the join condition:
SELECT *
FROM
books
LEFT OUTER JOIN
book_reviews ON
books.bookclub_id = book_reviews.bookclub_id
AND books.book_id = book_reviews.book_id
AND book_reviews.reviewer_id = 'alice'
AND book_reviews.rating != 1
AND book_reviews.rating != 2
WHERE books.bookclub_id = '00000000-0000-0000-0000-000000000000'
Or, a bit shorter:
AND book_reviews.rating not in (1, 2)
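One caveat worth checking against the requirement in the question: with the rating test inside the ON clause, a book that alice rated 1 or 2 is not removed from the result. The review simply fails to match, so the book comes back with NULLs in the book_reviews columns, just like an unrated book. A sketch for spotting such rows, assuming the schema and sample values from the question:

```sql
-- Lists books that alice rated 1 or 2 in the given book club.
-- Any book returned here still shows up in the LEFT JOIN query above,
-- only with NULLs in the book_reviews columns.
SELECT b.book_id, r.rating
FROM books AS b
JOIN book_reviews AS r
  ON r.bookclub_id = b.bookclub_id
 AND r.book_id = b.book_id
 AND r.reviewer_id = 'alice'
WHERE b.bookclub_id = '00000000-0000-0000-0000-000000000000'
  AND r.rating IN (1, 2);
```

If this returns nothing for your data, the two formulations give the same set of books; otherwise the low-rated books need to be excluded in the WHERE clause instead.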
Answer 1 (score: 0)
I believe we've figured it out. We were missing a set of parens in the WHERE clause:
WHERE
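Without them, the boolean logic associates incorrectly. This query returns the correct results and has a reasonable query plan, so it looks like that was the whole problem. Thanks for taking a look.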
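For reference, a sketch of the second query from the question with the extra parentheses added. This is a reconstruction from the description in this answer, not necessarily the exact query that was run. AND binds more tightly than OR, which is why the Filter line in the slow plan groups the bookclub_id test with only the IS NULL branch:

```sql
SELECT *
FROM books
LEFT OUTER JOIN book_reviews
  ON books.bookclub_id = book_reviews.bookclub_id
 AND books.book_id = book_reviews.book_id
 AND book_reviews.reviewer_id = 'alice'
WHERE books.bookclub_id = '00000000-0000-0000-0000-000000000000'
  -- The outer parentheses keep the bookclub_id test applied to both branches of the OR.
  AND ( book_reviews.rating IS NULL
        OR ( book_reviews.rating != 1
             AND book_reviews.rating != 2 ) );
```

With this grouping the bookclub_id condition applies to every row, so the planner no longer has to evaluate the filter across the whole books table.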