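I'm trying to fetch all tags that belong to all of a user's conversations (a user has many conversations through the ConversationUserPair join model), but the query takes about 2,000 ms on average.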
SELECT "tags"."tag_text_downcased"
FROM "tags"
INNER JOIN "conversations" ON "tags"."conversation_id" = "conversations"."id"
INNER JOIN "conversation_user_pairs" ON "conversations"."id" = "conversation_user_pairs"."conversation_id"
WHERE "conversation_user_pairs"."user_id" = ?
AND "conversation_user_pairs"."conversation_status" = ?
AND ("tags"."user_id" = ?);
When I run EXPLAIN ANALYZE in the psql console, this is the response I get:
EXPLAIN ANALYZE
SELECT "tags"."tag_text_downcased" FROM "tags" INNER JOIN "conversations" ON "tags"."conversation_id" = "conversations"."id" INNER JOIN "conversation_user_pairs" ON "conversations"."id" = "conversation_user_pairs"."conversation_id" WHERE "conversation_user_pairs"."user_id" = '459' AND "conversation_user_pairs"."conversation_status" = 'active' AND ("tags"."user_id" = '459');
Nested Loop (cost=462.87..486.65 rows=1 width=11) (actual time=0.457..1.886 rows=40 loops=1)
  Join Filter: (tags.conversation_id = conversations.id)
  -> Merge Join (cost=462.78..482.97 rows=1 width=19) (actual time=0.401..1.334 rows=40 loops=1)
        Merge Cond: (tags.conversation_id = conversation_user_pairs.conversation_id)
        -> Sort (cost=462.70..462.83 rows=259 width=15) (actual time=0.332..0.337 rows=40 loops=1)
              Sort Key: tags.conversation_id
              Sort Method: quicksort Memory: 27kB
              -> Bitmap Heap Scan on tags (cost=4.49..460.62 rows=259 width=15) (actual time=0.152..0.295 rows=40 loops=1)
                    Recheck Cond: (user_id = 459)
                    Heap Blocks: exact=23
                    -> Bitmap Index Scan on index_tags_on_user_id_and_conversation_id (cost=0.00..4.47 rows=259 width=0) (actual time=0.105..0.105 rows=40 loops=1)
                          Index Cond: (user_id = 459)
        -> Index Only Scan using by_user_and_conversation_and_status on conversation_user_pairs (cost=0.08..20.02 rows=522 width=4) (actual time=0.066..0.956 rows=390 loops=1)
              Index Cond: ((user_id = 459) AND (conversation_status = 'active'::text))
              Heap Fetches: 134
  -> Index Only Scan using index_conversations_on_id on conversations (cost=0.08..3.68 rows=1 width=4) (actual time=0.013..0.013 rows=1 loops=40)
        Index Cond: (id = conversation_user_pairs.conversation_id)
        Heap Fetches: 40
I believe I have appropriate indexes on each of the three tables involved. I have:
add_index "tags", ["conversation_id", "user_id", "tag_text_downcased"], name: "find_tag_text_downcased_tags"
add_index "tags", ["conversation_id", "user_id"], name: "index_conversation_first_tags"
add_index "tags", ["user_id", "conversation_id"], name: "index_tags_on_user_id_and_conversation_id"
add_index "conversation_user_pairs", ["user_id", "conversation_id", "conversation_status"], name: "by_user_and_conversation_and_status"
add_index "conversations", ["id"], name: "index_conversations_on_id"
It seems that none of the indexes on these tables are being used to speed up the query? Or is there a way to have a multi-table index?
Answer 0: (score: 0)
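I'll make an educated guess, since some information is missing...
The query you show does not fit your stated objective:
"I'm trying to fetch all tags that belong to all of a user's conversations"
I assume "all" is meant to be "any".
I also assume that referential integrity is enforced with foreign key constraints. Then we can cut out the middleman, conversations. Joining to it only adds cost.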
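The foreign key assumption can be made explicit. What follows is a minimal sketch, assuming the constraint does not exist yet; the constraint name tags_conversation_id_fkey is hypothetical. Once every non-null tags.conversation_id is guaranteed to reference exactly one row in conversations, joining tags directly to conversation_user_pairs should return the same rows as the three-table join:

```sql
-- Hypothetical constraint name; skip this step if the foreign key already exists.
ALTER TABLE tags
  ADD CONSTRAINT tags_conversation_id_fkey
  FOREIGN KEY (conversation_id) REFERENCES conversations (id);

-- Same filters as the original query, minus the join to conversations.
SELECT t.tag_text_downcased
FROM   tags t
JOIN   conversation_user_pairs cu ON cu.conversation_id = t.conversation_id
WHERE  t.user_id = 459
AND    cu.user_id = 459
AND    cu.conversation_status = 'active';
```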
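As written, your query can return the same tag multiple times. Assuming you want distinct tags, it is enough to assert that at least one matching row exists in conversation_user_pairs. An EXISTS semi-join is typically the best way to do that: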
SELECT t.tag_text_downcased
FROM tags t
WHERE t.user_id = 459 -- assuming it's a numeric data type
AND EXISTS (
SELECT  -- empty SELECT list is intentional (valid in PostgreSQL 9.4 and later)
FROM conversation_user_pairs cu
WHERE cu.user_id = t.user_id
AND cu.conversation_id = t.conversation_id
AND cu.conversation_status = 'active'
);
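To illustrate the difference between a plain join and the semi-join, here is a small throwaway demo. The temporary tables demo_tags and demo_pairs and the sample tag 'billing' are made up for this sketch, and it assumes that (user_id, conversation_id) is not guaranteed to be unique in conversation_user_pairs:

```sql
-- Throwaway demo tables (hypothetical names), dropped automatically at session end.
CREATE TEMP TABLE demo_tags  (user_id int, conversation_id int, tag_text_downcased text);
CREATE TEMP TABLE demo_pairs (user_id int, conversation_id int, conversation_status text);

INSERT INTO demo_tags  VALUES (459, 1, 'billing');
-- Two matching pair rows for the same user and conversation:
INSERT INTO demo_pairs VALUES (459, 1, 'active'), (459, 1, 'active');

-- Plain join: the tag row is returned once per matching pair row (2 rows here).
SELECT t.tag_text_downcased
FROM   demo_tags t
JOIN   demo_pairs cu ON cu.user_id = t.user_id
                    AND cu.conversation_id = t.conversation_id
WHERE  t.user_id = 459
AND    cu.conversation_status = 'active';

-- Semi-join: each tag row is returned at most once (1 row here).
SELECT t.tag_text_downcased
FROM   demo_tags t
WHERE  t.user_id = 459
AND    EXISTS (
   SELECT FROM demo_pairs cu
   WHERE  cu.user_id = t.user_id
   AND    cu.conversation_id = t.conversation_id
   AND    cu.conversation_status = 'active'
   );
```

Note that EXISTS only prevents rows in tags from being multiplied by the join. If the same tag text is attached to several conversations, it still appears once per row in tags; add DISTINCT if you need each text only once.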
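Your index find_tag_text_downcased_tags on tags is perfect.
by_user_and_conversation_and_status is a good fit, too. If many rows are not 'active' and you are mostly interested in active rows, a partial index may be even better: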
CREATE INDEX ON conversation_user_pairs (user_id, conversation_id)
WHERE conversation_status = 'active';
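One caveat may be worth checking, assuming your application sends conversation_status as a bind parameter (as the ? placeholders in the question suggest). The PostgreSQL documentation notes that partial index predicates are matched at planning time, so a clause containing only a parameter placeholder cannot by itself imply the index predicate. Writing the status as a literal is the safe way to make the index eligible. The sketch below uses a hypothetical index name and then checks the plan:

```sql
-- Hypothetical index name. CONCURRENTLY avoids blocking writes during the build
-- and cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY cup_active_user_conversation_idx
ON conversation_user_pairs (user_id, conversation_id)
WHERE conversation_status = 'active';

-- Inspect the plan; the literal 'active' matches the index predicate,
-- so the planner is free to consider the partial index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT t.tag_text_downcased
FROM   tags t
WHERE  t.user_id = 459
AND    EXISTS (
   SELECT FROM conversation_user_pairs cu
   WHERE  cu.user_id = t.user_id
   AND    cu.conversation_id = t.conversation_id
   AND    cu.conversation_status = 'active'
   );
```

Whether the planner actually picks the partial index depends on table statistics and data distribution, so treat the EXPLAIN output as the authority.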
You don't need any other indexes here. Since you have these two:
add_index "tags", ["conversation_id", "user_id", "tag_text_downcased"], name: "find_tag_text_downcased_tags"
add_index "tags", ["user_id", "conversation_id"], name: "index_tags_on_user_id_and_conversation_id"
...it is typically not useful to also keep this one:
add_index "tags", ["conversation_id", "user_id"], name: "index_conversation_first_tags"
You can drop it.
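Before dropping the index, it may be reassuring to confirm that it is rarely or never scanned. A minimal sketch, assuming the statistics collector has been running long enough to reflect typical traffic:

```sql
-- How often has each index on tags been scanned since statistics were last reset?
SELECT indexrelname, idx_scan
FROM   pg_stat_user_indexes
WHERE  relname = 'tags'
ORDER  BY idx_scan;

-- Drop the redundant index without blocking concurrent reads and writes on tags.
-- CONCURRENTLY cannot run inside a transaction block.
DROP INDEX CONCURRENTLY IF EXISTS index_conversation_first_tags;
```

If the schema is managed through Rails migrations, as the add_index lines suggest, you would probably remove the index in a migration instead, so that the schema file stays in sync.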
Aside: if conversation_status only holds 'active' and 'dead' or similar, make it a boolean column. That is smaller and cheaper than text.
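One possible way to make that change is sketched below. The column name is_active is hypothetical, and the sketch assumes that any status other than 'active' (including NULL) should map to false. Adding a new column, rather than changing the type in place, sidesteps problems with existing indexes whose definitions compare the column to a text value:

```sql
-- Hypothetical column name is_active.
ALTER TABLE conversation_user_pairs ADD COLUMN is_active boolean;

-- IS NOT DISTINCT FROM never yields NULL, so NULL statuses become false.
UPDATE conversation_user_pairs
SET    is_active = (conversation_status IS NOT DISTINCT FROM 'active');

ALTER TABLE conversation_user_pairs ALTER COLUMN is_active SET NOT NULL;

-- The partial index predicate then becomes a plain boolean test.
CREATE INDEX ON conversation_user_pairs (user_id, conversation_id)
WHERE is_active;
```

The UPDATE rewrites every row, so on a large table you may want to run it in batches. Drop the old text column only after the application no longer reads or writes it.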