Question

I have the following query:

SELECT "person_dimensions"."dimension" 
FROM   "person_dimensions" 
join   users 
on     users.id = person_dimensions.user_id 
where  users.team_id = 2

Here are the EXPLAIN ANALYZE results:

Nested Loop  (cost=0.43..93033.84 rows=452 width=11) (actual time=1245.321..42915.426 rows=827 loops=1)
      ->  Seq Scan on person_dimensions  (cost=0.00..254.72 rows=13772 width=15) (actual time=0.022..9.907 rows=13772 loops=1)
      ->  Index Scan using users_pkey on users  (cost=0.43..6.73 rows=1 width=4) (actual time=2.978..3.114 rows=0 loops=13772)
            Index Cond: (id = person_dimensions.user_id)
            Filter: (team_id = 2)
            Rows Removed by Filter: 1
Planning time: 0.396 ms
Execution time: 42915.678 ms

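There are indexes on person_dimensions.user_id and users.team_id, so it is not clear why this seemingly simple query takes so long.

Maybe it has something to do with team_id not being usable in the join condition? Any ideas on how to speed this up?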
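
One way to read the plan above: the inner index scan on users_pkey ran 13772 times at roughly 3.1 ms per loop, which is about 13772 × 3.114 ≈ 42,886 ms and accounts for nearly all of the 42.9 s runtime. A few milliseconds per primary-key lookup is unusually slow, and one possible (unconfirmed) explanation is that the lookups were reading from disk rather than from cache. As a rough sketch, the BUFFERS option of EXPLAIN could be used to check this, since it reports buffer hits and reads per plan node. It is a diagnostic step only, not a fix.

```sql
-- Same query as in the question, with buffer statistics added.
-- "shared hit" counts pages found in PostgreSQL's buffer cache;
-- "shared read" counts pages that had to be requested from the OS or disk.
EXPLAIN (ANALYZE, BUFFERS)
SELECT person_dimensions.dimension
FROM   person_dimensions
JOIN   users
ON     users.id = person_dimensions.user_id
WHERE  users.team_id = 2;
```

A high "shared read" count on the users_pkey node would be consistent with the disk-access explanation; mostly "shared hit" would point elsewhere.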

Edit:

I tried this query:

SELECT "person_dimensions"."dimension" 
FROM "person_dimensions"
join users ON users.id = person_dimensions.user_id 
WHERE users.id IN (2337,2654,3501,56,4373,1060,3170,97,4629,41,3175,4541,2827)

where the list contains the ids returned by this subquery:

SELECT id FROM users WHERE team_id = 2

It ran in 380 ms, versus 42 s for the query above. I could use this as a workaround, but I'm really curious about what is going on here...
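
The hard-coded id list can also be written as a subquery, so the ids do not have to be fetched in a separate step. This is only a sketch: PostgreSQL is free to rewrite an IN subquery as a join, so the planner might still choose the slow nested-loop plan, and the result would need to be checked with EXPLAIN ANALYZE.

```sql
-- Filter person_dimensions by the ids of users in team 2.
-- Because users.id is the primary key, this returns the same rows
-- as the original join on users with team_id = 2.
SELECT person_dimensions.dimension
FROM   person_dimensions
WHERE  person_dimensions.user_id IN (
         SELECT id
         FROM   users
         WHERE  team_id = 2
       );
```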

Answer 1

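I restarted my database server yesterday, and when it came back up, the same query performed as expected, with a completely different query plan that used the expected index: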

QUERY PLAN
Hash Join  (cost=1135.63..1443.45 rows=84 width=11) (actual time=0.354..6.312 rows=835 loops=1)
  Hash Cond: (person_dimensions.user_id = users.id)
  ->  Seq Scan on person_dimensions  (cost=0.00..255.17 rows=13817 width=15) (actual time=0.002..2.764 rows=13902 loops=1)
  ->  Hash  (cost=1132.96..1132.96 rows=214 width=4) (actual time=0.175..0.175 rows=60 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 11kB
        ->  Bitmap Heap Scan on users  (cost=286.07..1132.96 rows=214 width=4) (actual time=0.032..0.157 rows=60 loops=1)
              Recheck Cond: (team_id = 2)
              Heap Blocks: exact=68
              ->  Bitmap Index Scan on index_users_on_team_id  (cost=0.00..286.02 rows=214 width=0) (actual time=0.021..0.021 rows=82 loops=1)
                    Index Cond: (team_id = 2)
Planning time: 0.215 ms
Execution time: 6.474 ms

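Does anyone have any idea why it took a restart for the planner to figure all of this out? Could it be that a manual VACUUM had not been run for a while, or something similar? For what it's worth, I ran ANALYZE on the relevant tables before the restart, and it did not change anything.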
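
To check the vacuum hypothesis, the statistics view pg_stat_user_tables records when each table was last vacuumed and analyzed, along with an estimate of dead tuples. The following is a minimal sketch using the table names from the question. It does not establish that a missing VACUUM was the cause here; it only shows where to look.

```sql
-- When were the two tables last vacuumed and analyzed, and how many
-- dead tuples do the statistics report?
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname IN ('users', 'person_dimensions');

-- Manually vacuum one table and refresh its planner statistics.
VACUUM (VERBOSE, ANALYZE) users;
```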
