Given the schema on this SQLFiddle, I am trying to retrieve the last message between two users with the following query:
SELECT DISTINCT ON ("user_id") *
FROM
(
(
SELECT DISTINCT ON ("user_id")
"id",
"recipient_id" AS "user_id",
"body",
"read",
"created_at"
FROM "messages"
WHERE "sender_id" = 1
ORDER BY "user_id", "created_at" DESC
)
UNION ALL
(
SELECT DISTINCT ON ("user_id")
"id",
"sender_id" AS "user_id",
"body",
"read",
"created_at"
FROM "messages"
WHERE "recipient_id" = 1
ORDER BY "user_id", "created_at" DESC
)
) AS "messages"
INNER JOIN "users" ON ("users"."id" = "messages"."user_id")
ORDER BY "user_id", "messages"."created_at" DESC
LIMIT 20;
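It works as expected and is very fast when the given user does not have many messages. As the number of messages grows, though, and especially when the message bodies are large, execution becomes much slower. Analyzing the execution plan shows that the bottleneck is the ORDER BY in the two subqueries, because each has to sort roughly 10k rows in memory.
After struggling with this query for 5 hours, I have not been able to find a faster way to get what I want. I tried adding indexes on (sender_id, created_at DESC) and (recipient_id, created_at DESC), but that did not seem to help.
So, what am I doing wrong?
Thanks
PS: Here is the execution plan: http://explain.depesz.com/s/0aE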
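For reference, the two indexes mentioned above would look roughly like the sketch below. The index names are made up for illustration; only the column lists come from the description above.

```sql
-- Hypothetical index names; the column lists match the ones described above.
CREATE INDEX messages_sender_created_idx
    ON "messages" ("sender_id", "created_at" DESC);

CREATE INDEX messages_recipient_created_idx
    ON "messages" ("recipient_id", "created_at" DESC);
```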
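Answer 0 (score: 1)
My two tips:
Leave body, read, username, and name out of the inner queries, and join them to the result in a new wrapping query. Sorry for dropping the double quotes ;)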
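SELECT s.id, user_id, body, read, s.created_at, username, name
FROM (
    -- Keep only the newest message per conversation partner.
    SELECT DISTINCT ON (user_id) *
    FROM (
        -- Messages sent by user 1: the partner is the recipient.
        -- No DISTINCT ON here: without an ORDER BY it would keep an arbitrary row per user_id.
        SELECT id, recipient_id AS user_id, created_at
        FROM messages
        WHERE sender_id = 1
        UNION ALL
        -- Messages received by user 1: the partner is the sender.
        SELECT id, sender_id AS user_id, created_at
        FROM messages
        WHERE recipient_id = 1
    ) s
    ORDER BY user_id, created_at DESC
    LIMIT 20
) s
JOIN users u ON (u.id = s.user_id)
-- Fetch the wide columns (body, read) only for the rows that survive the LIMIT.
JOIN messages m ON (m.id = s.id);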
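Answer 1 (score: 0)
You are taking the union of two huge queries, sorting that union by a column that is part of each subquery, and then keeping only the first 20 results. It will most likely be faster if you sort and limit each subquery the same way you sort and limit the union.
This may be unrelated to performance, but I do not see the point of sorting by and selecting DISTINCT ON the "user_id" column when those columns all hold a single value (the ID of the user you are searching for). Am I missing something?
So when there are a lot of messages, I think something like this would be much faster: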
SELECT *
FROM
(
(
SELECT
-- ...
ORDER BY "created_at" DESC
LIMIT 20
)
UNION ALL
(
SELECT
-- ...
ORDER BY "created_at" DESC
LIMIT 20
)
) AS "messages"
INNER JOIN -- ...
ORDER BY "messages"."created_at" DESC
LIMIT 20;
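By limiting each subquery to its 20 most recent messages, you know that the resulting (at most) 40 messages contain the 20 most recent ones overall. Those 20 may all come from one subquery, all from the other, or be split between the two.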
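As a rough sketch, the skeleton above could be filled in with the column names from the question as shown below. This is only one possible reading of the elided parts. Note that without DISTINCT ON it returns the 20 most recent messages overall, not one message per user.

```sql
-- Assumption: the elided parts reuse the question's column lists and filters.
SELECT *
FROM
(
    (
        -- The 20 newest messages sent by user 1.
        SELECT "id", "recipient_id" AS "user_id", "body", "read", "created_at"
        FROM "messages"
        WHERE "sender_id" = 1
        ORDER BY "created_at" DESC
        LIMIT 20
    )
    UNION ALL
    (
        -- The 20 newest messages received by user 1.
        SELECT "id", "sender_id" AS "user_id", "body", "read", "created_at"
        FROM "messages"
        WHERE "recipient_id" = 1
        ORDER BY "created_at" DESC
        LIMIT 20
    )
) AS "messages"
INNER JOIN "users" ON ("users"."id" = "messages"."user_id")
-- At most 40 rows reach this final sort.
ORDER BY "messages"."created_at" DESC
LIMIT 20;
```

With indexes on (sender_id, created_at DESC) and (recipient_id, created_at DESC), each inner query can probably read its 20 rows straight from the index instead of sorting every matching row, though that depends on the planner.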