Postgres query optimization with a simple join

Date: 2017-03-20 18:38:22

Tags: postgresql query-optimization

I have the following query:

SELECT "person_dimensions"."dimension"
FROM   "person_dimensions"
JOIN   users
ON     users.id = person_dimensions.user_id
WHERE  users.team_id = 2

Here is the output of EXPLAIN ANALYZE:

Nested Loop  (cost=0.43..93033.84 rows=452 width=11) (actual time=1245.321..42915.426 rows=827 loops=1)
      ->  Seq Scan on person_dimensions  (cost=0.00..254.72 rows=13772 width=15) (actual time=0.022..9.907 rows=13772 loops=1)
      ->  Index Scan using users_pkey on users  (cost=0.43..6.73 rows=1 width=4) (actual time=2.978..3.114 rows=0 loops=13772)
            Index Cond: (id = person_dimensions.user_id)
            Filter: (team_id = 2)
            Rows Removed by Filter: 1
Planning time: 0.396 ms
Execution time: 42915.678 ms

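There are indexes on person_dimensions.user_id and users.team_id, so it is unclear why this seemingly simple query takes so long.

Maybe it has something to do with team_id not being usable in the join condition? Any ideas on how to speed this up?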
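
A note on reading the plan above: the inner index scan on users_pkey ran 13772 times (loops=13772) at roughly 3.114 ms per loop, so that node alone accounts for nearly all of the 42.9 s. About 3 ms for a single primary-key lookup is slow. One possible cause (an assumption, not something the plan shows) is that the pages were being read from disk rather than from cache. A minimal sketch of how one might check this, using the BUFFERS option of EXPLAIN to report shared-buffer hits versus reads:

```sql
-- Rough arithmetic from the plan: loops * average time per loop (ms).
SELECT 13772 * 3.114 AS approx_inner_scan_ms;  -- 42886.008

-- The same query, with buffer statistics added to each plan node.
-- A large "read" count relative to "hit" would suggest the time is spent on I/O.
EXPLAIN (ANALYZE, BUFFERS)
SELECT person_dimensions.dimension
FROM   person_dimensions
JOIN   users ON users.id = person_dimensions.user_id
WHERE  users.team_id = 2;
```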

EDIT:

I tried this query:

SELECT "person_dimensions"."dimension"
FROM   "person_dimensions"
JOIN   users ON users.id = person_dimensions.user_id
WHERE  users.id IN (2337,2654,3501,56,4373,1060,3170,97,4629,41,3175,4541,2827)

where the list contains the ids returned by this subquery:

SELECT id FROM users WHERE team_id = 2

This ran in 380 ms, versus 42 s for the original. I could use it as a workaround, but I am really curious about what is going on here...
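
For comparison, a minimal sketch of the same workaround with the id list written as an inline subquery instead of hard-coded values. Because users.id is the primary key, this should return the same rows as the original join. Whether the planner picks a better plan for this form is not guaranteed, so it is worth checking with EXPLAIN ANALYZE. The aliases pd and u are only for illustration.

```sql
-- Filter person_dimensions by the set of user ids belonging to team 2.
SELECT pd.dimension
FROM   person_dimensions AS pd
WHERE  pd.user_id IN (
         SELECT u.id
         FROM   users AS u
         WHERE  u.team_id = 2
       );
```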

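1 Answer:

Answer 0: (score: 0)

I restarted my database server yesterday, and when it came back up the same query performed as expected, with a completely different query plan that used the expected indexes: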

QUERY PLAN
Hash Join  (cost=1135.63..1443.45 rows=84 width=11) (actual time=0.354..6.312 rows=835 loops=1)
  Hash Cond: (person_dimensions.user_id = users.id)
  ->  Seq Scan on person_dimensions  (cost=0.00..255.17 rows=13817 width=15) (actual time=0.002..2.764 rows=13902 loops=1)
  ->  Hash  (cost=1132.96..1132.96 rows=214 width=4) (actual time=0.175..0.175 rows=60 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 11kB
        ->  Bitmap Heap Scan on users  (cost=286.07..1132.96 rows=214 width=4) (actual time=0.032..0.157 rows=60 loops=1)
              Recheck Cond: (team_id = 2)
              Heap Blocks: exact=68
              ->  Bitmap Index Scan on index_users_on_team_id  (cost=0.00..286.02 rows=214 width=0) (actual time=0.021..0.021 rows=82 loops=1)
                    Index Cond: (team_id = 2)
Planning time: 0.215 ms
Execution time: 6.474 ms

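Does anyone have an idea why a restart was needed for the planner to pick this up? Could it be that a manual VACUUM was needed and had not been run for a while, or something similar? Note that I ran ANALYZE on the relevant tables before the restart, and it did not change anything.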
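
Since stale statistics and a missed vacuum are raised as possibilities here, a minimal sketch of how one might inspect both tables using the standard pg_stat_user_tables view. It only shows when vacuum and analyze last ran and how many dead rows are tracked. It does not by itself explain why a restart changed the plan.

```sql
-- When were the tables last vacuumed/analyzed, and how many dead rows do they carry?
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname IN ('users', 'person_dimensions');

-- Reclaim dead rows and refresh planner statistics in one pass.
VACUUM (VERBOSE, ANALYZE) users;
VACUUM (VERBOSE, ANALYZE) person_dimensions;
```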