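I have read "How do I speed up counting rows in a PostgreSQL table?" and https://wiki.postgresql.org/wiki/Slow_Counting, but I am no closer to a better result. The problem with most of the answers to that other question is that they do not involve any kind of filtering and rely on table-wide statistics.
I have a table with about 10 million rows. The data is currently fetched in pages (I know this pagination strategy is not ideal), and I have to show the total count to the user (a business requirement). The page query is fast, typically under 200 ms, and looks like this: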
explain analyze
SELECT DISTINCT ON (id) "table".*
FROM "table"
WHERE "table"."deleted_at" IS NULL
GROUP BY "table"."id"
ORDER BY "table"."id" DESC
LIMIT 25 OFFSET 200;
                                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=530.48..585.91 rows=25 width=252) (actual time=0.870..0.942 rows=25 loops=1)
   ->  Unique  (cost=87.00..19878232.36 rows=8964709 width=252) (actual time=0.328..0.899 rows=225 loops=1)
         ->  Group  (cost=87.00..15395877.86 rows=8964709 width=252) (actual time=0.327..0.747 rows=225 loops=1)
               Group Key: id
               ->  Index Scan Backward using table_pkey on table  (cost=87.00..10913523.36 rows=8964709 width=252) (actual time=0.324..0.535 rows=225 loops=1)
                     Filter: (deleted_at IS NULL)
                     Rows Removed by Filter: 397
 Planning time: 0.174 ms
 Execution time: 0.986 ms
(9 rows)
Time: 77.437 ms
The problem comes when I try to get the count with:
explain analyze
SELECT COUNT(*) AS count_all, "table"."id" AS id
FROM "table"
WHERE "table"."deleted_at" IS NULL
GROUP BY "table"."id";
                                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=87.00..21194868.36 rows=10282202 width=4) (actual time=0.016..16984.904 rows=10343557 loops=1)
   Group Key: id
   ->  Index Scan using table_pkey on table  (cost=87.00..5771565.36 rows=10282202 width=4) (actual time=0.012..11435.350 rows=10343557 loops=1)
         Filter: (deleted_at IS NULL)
         Rows Removed by Filter: 2170
 Planning time: 0.098 ms
 Execution time: 18638.381 ms
(7 rows)
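At the moment I cannot use probabilistic counts, but I also cannot live with 10 to 50 seconds to return the count. Is there any other way I can speed this up?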
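A note on the count query above: assuming id is the primary key (the plan scans table_pkey), grouping by id returns one row per id, each with a count of 1, which is why GroupAggregate emits 10343557 rows instead of one total. A minimal sketch of the single-total form follows. It is only an illustration of what "the total" would look like, not a measured fix, and it has not been timed on this table.

```sql
-- Sketch only: one total for the same filter, without grouping by id.
-- Assumes no other code depends on the per-id rows of the grouped query.
EXPLAIN ANALYZE
SELECT COUNT(*) AS count_all
FROM "table"
WHERE "table"."deleted_at" IS NULL;
```

This version still has to visit every non-deleted row, so it may not be fast enough on its own, but it avoids building and returning roughly 10 million groups.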