Postgres uses a slower index with a small LIMIT

Time: 2015-12-15 14:53:24

Tags: ruby-on-rails performance postgresql postgresql-performance

I have an interesting puzzle.

I have a few different queries that slow down significantly in certain situations.

This one is fast:

SELECT
  "posts".*
FROM "posts"
WHERE "posts"."source_id" IN (29949, 29952, 29950, 33642, 33626, 33627, 33625)
AND "posts"."deleted_at" IS NULL
AND "posts"."rejected_at" IS NULL
ORDER BY POSITION ASC, external_created_at DESC
LIMIT 100
OFFSET 0;

This one is slow:

SELECT
  "posts".*
FROM "posts"
WHERE "posts"."source_id" IN (29949, 29952, 29950, 33642, 33626, 33627, 33625)
AND "posts"."deleted_at" IS NULL
AND "posts"."rejected_at" IS NULL
ORDER BY POSITION ASC, external_created_at DESC
LIMIT 5
OFFSET 0;

The only difference is the LIMIT.

The strangest part is that a query very similar to #2 is fast:

SELECT
  "posts".*
FROM "posts"
WHERE "posts"."source_id" IN (5868, 5867)
AND "posts"."deleted_at" IS NULL
AND "posts"."rejected_at" IS NULL
ORDER BY POSITION ASC, external_created_at DESC
LIMIT 100
OFFSET 0;

It just looks at a smaller set of source_ids.

Here are the query plans for all three:

EXPLAIN ANALYZE SELECT  "posts".* FROM "posts"  WHERE "posts"."source_id" IN (29949, 29952, 29950, 33642, 33626, 33627, 33625) AND "posts"."deleted_at" IS NULL AND "posts"."rejected_at" IS NULL  ORDER BY POSITION ASC, external_created_at DESC LIMIT 100 OFFSET 0;
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=36900.88..36901.13 rows=100 width=1051) (actual time=104.564..104.570 rows=28 loops=1)
   ->  Sort  (cost=36900.88..36926.01 rows=10052 width=1051) (actual time=104.559..104.563 rows=28 loops=1)
         Sort Key: "position", external_created_at
         Sort Method: quicksort  Memory: 53kB
         ->  Index Scan using index_posts_on_source_id on posts  (cost=0.44..36516.70 rows=10052 width=1051) (actual time=9.724..102.885 rows=28 loops=1)
               Index Cond: (source_id = ANY ('{29949,29952,29950,33642,33626,33627,33625}'::integer[]))
               Filter: ((deleted_at IS NULL) AND (rejected_at IS NULL))
               Rows Removed by Filter: 1737
 Total runtime: 105.774 ms


EXPLAIN ANALYZE SELECT  "posts".* FROM "posts"  WHERE "posts"."source_id" IN (29949, 29952, 29950, 33642, 33626, 33627, 33625) AND "posts"."deleted_at" IS NULL AND "posts"."rejected_at" IS NULL  ORDER BY POSITION ASC, external_created_at DESC LIMIT 5 OFFSET 0;
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..18788.72 rows=5 width=1051) (actual time=79611.044..314266.666 rows=5 loops=1)
   ->  Index Scan using index_posts_on_position_and_external_created_at on posts  (cost=0.56..37771717.36 rows=10052 width=1051) (actual time=79610.677..314266.292 rows=5 loops=1)
         Filter: ((deleted_at IS NULL) AND (rejected_at IS NULL) AND (source_id = ANY ('{29949,29952,29950,33642,33626,33627,33625}'::integer[])))
         Rows Removed by Filter: 3665332
 Total runtime: 314269.266 ms


EXPLAIN ANALYZE SELECT  "posts".* FROM "posts"  WHERE "posts"."source_id" IN (5868, 5867) AND "posts"."deleted_at" IS NULL AND "posts"."rejected_at" IS NULL ORDER BY POSITION ASC, external_created_at DESC LIMIT 100 OFFSET 0;
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10587.37..10587.62 rows=100 width=1051) (actual time=1017.476..1017.498 rows=100 loops=1)
   ->  Sort  (cost=10587.37..10594.55 rows=2872 width=1051) (actual time=1017.474..1017.483 rows=100 loops=1)
         Sort Key: "position", external_created_at
         Sort Method: top-N heapsort  Memory: 112kB
         ->  Index Scan using index_posts_on_source_id on posts  (cost=0.44..10477.60 rows=2872 width=1051) (actual time=2.823..999.417 rows=4334 loops=1)
               Index Cond: (source_id = ANY ('{5868,5867}'::integer[]))
               Filter: ((deleted_at IS NULL) AND (rejected_at IS NULL))
               Rows Removed by Filter: 39
 Total runtime: 1017.669 ms

Here are my index definitions:

"posts_pkey" PRIMARY KEY, btree (id)
"index_posts_on_deleted_at" btree (deleted_at)
"index_posts_on_external_created_at" btree (external_created_at)
"index_posts_on_external_id" btree (external_id)
"index_posts_on_position" btree ("position")
"index_posts_on_position_and_external_created_at" btree ("position", external_created_at DESC)
"index_posts_on_rejected_at" btree (rejected_at)
"index_posts_on_source_id" btree (source_id)

I am running Postgres version 9.3.4.

Why does the slow query use the slower Index Scan using index_posts_on_position_and_external_created_at on posts when the other two use Index Scan using index_posts_on_source_id on posts? How can I fix it?

1 Answer:

Answer 0 (score: 1)

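A somewhat late answer, but if you still have this problem, why not simply drop index_posts_on_position_and_external_created_at? As you said, the problem appears when the query planner uses this particular index.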
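The cost figures in your second plan hint at why the planner picks that index for LIMIT 5. This is only a rough sketch, assuming the planner charges the Limit node a share of the ordered index scan proportional to limit / estimated matching rows, which in turn assumes the 10052 estimated matches are spread evenly through the index. The numbers are copied from your plans; the column aliases are made up for illustration.

```sql
-- From the second plan: the ordered index scan is estimated at
-- cost 0.56..37771717.36 and is expected to yield 10052 matching rows.
-- Assumption: stopping after N rows costs about N / 10052 of the full scan.
SELECT round(0.56 + (37771717.36 - 0.56) * 5   / 10052, 2) AS limit_5_cost,
       round(0.56 + (37771717.36 - 0.56) * 100 / 10052, 2) AS limit_100_cost;
```

The first value comes out at about 18788.72, which is the Limit cost shown in the second plan and is lower than the roughly 36900 estimated for the scan on index_posts_on_source_id plus the sort. The second value is about ten times higher than that sort-based estimate, which would explain why LIMIT 100 keeps the fast plan. In reality the matching rows are not evenly spread: the scan discarded 3665332 rows before it found five.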

You already have the following two indexes:

"index_posts_on_external_created_at" btree (external_created_at)
"index_posts_on_position" btree ("position")

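These two make index_posts_on_position_and_external_created_at largely redundant, because PostgreSQL can use multiple indexes for a given query. If you are worried about sort performance, you can add a sort order to index_posts_on_external_created_at.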
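One way to try this without committing to it is to drop the index inside a transaction, re-run the slow query, and roll back. This is a minimal sketch, not something tested against your data. Note that DROP INDEX holds an exclusive lock on posts until the transaction ends, so other sessions will block in the meantime. The index name index_posts_on_external_created_at_desc in the last statement is hypothetical.

```sql
-- Try the change inside a transaction so that it can be undone.
BEGIN;

DROP INDEX index_posts_on_position_and_external_created_at;

-- Re-run the slow query and check which plan the planner chooses now.
EXPLAIN ANALYZE
SELECT "posts".*
FROM "posts"
WHERE "posts"."source_id" IN (29949, 29952, 29950, 33642, 33626, 33627, 33625)
  AND "posts"."deleted_at" IS NULL
  AND "posts"."rejected_at" IS NULL
ORDER BY "position" ASC, external_created_at DESC
LIMIT 5
OFFSET 0;

-- Restores the index; use COMMIT instead to keep it dropped.
ROLLBACK;

-- Optional, if you want an explicit sort order on external_created_at
-- (hypothetical index name, chosen here for illustration only).
CREATE INDEX index_posts_on_external_created_at_desc
    ON posts (external_created_at DESC);
```

If the plan inside the transaction switches to index_posts_on_source_id followed by a sort, as in your two fast queries, dropping the index for real is probably safe for this query. It is still worth checking whether other queries rely on it.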