我有以下查询:
SELECT "person_dimensions"."dimension"
FROM "person_dimensions"
join users
on users.id = person_dimensions.user_id
where users.team_id = 2
以下是EXPLAIN ANALYZE的结果:
Nested Loop (cost=0.43..93033.84 rows=452 width=11) (actual time=1245.321..42915.426 rows=827 loops=1)
-> Seq Scan on person_dimensions (cost=0.00..254.72 rows=13772 width=15) (actual time=0.022..9.907 rows=13772 loops=1)
-> Index Scan using users_pkey on users (cost=0.43..6.73 rows=1 width=4) (actual time=2.978..3.114 rows=0 loops=13772)
Index Cond: (id = person_dimensions.user_id)
Filter: (team_id = 2)
Rows Removed by Filter: 1
Planning time: 0.396 ms
Execution time: 42915.678 ms
person_dimensions.user_id和users.team_id上存在索引,因此不清楚为什么这个看似简单的查询需要这么长时间。
也许这与team_id无法在连接条件中使用有关?想法如何加快速度?
编辑:
我尝试了这个查询:
SELECT "person_dimensions"."dimension"
FROM "person_dimensions"
join users ON users.id = person_dimensions.user_id
WHERE users.id IN (2337,2654,3501,56,4373,1060,3170,97,4629,41,3175,4541,2827)
包含子查询返回的id:
SELECT id FROM users WHERE team_id = 2
结果是380ms,而上述时间为42s。我可以使用它作为一种解决方法,但我真的很好奇这里发生了什么......
答案 0 :(得分:0)
我昨天重新启动了我的数据库服务器,当它恢复时,同样的查询按预期执行,使用完全不同的查询计划使用了预期的索引:
QUERY PLAN
Hash Join (cost=1135.63..1443.45 rows=84 width=11) (actual time=0.354..6.312 rows=835 loops=1)
Hash Cond: (person_dimensions.user_id = users.id)
-> Seq Scan on person_dimensions (cost=0.00..255.17 rows=13817 width=15) (actual time=0.002..2.764 rows=13902 loops=1)
-> Hash (cost=1132.96..1132.96 rows=214 width=4) (actual time=0.175..0.175 rows=60 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> Bitmap Heap Scan on users (cost=286.07..1132.96 rows=214 width=4) (actual time=0.032..0.157 rows=60 loops=1)
Recheck Cond: (team_id = 2)
Heap Blocks: exact=68
-> Bitmap Index Scan on index_users_on_team_id (cost=0.00..286.02 rows=214 width=0) (actual time=0.021..0.021 rows=82 loops=1)
Index Cond: (team_id = 2)
Planning time: 0.215 ms
Execution time: 6.474 ms
任何人都有任何想法为什么需要重新启动才能知道所有这些?是不是需要手动真空吸尘器一段时间没有完成,或类似的东西?回想一下,我在重新启动之前对相关表进行了分析,并没有改变任何内容。