我是一名postgres新手。 我们的应用程序有一个类似的查询。
select count(distinct pk_column) from table;
explain analyze select count(distinct pk_column) from table;
QUERY PLAN
Aggregate (cost=35797.30..35797.31 rows=1 width=8) (actual time=1251.631..1251.632 rows=1 loops=1)
-> Seq Scan on table (cost=0.00..34606.24 rows=476424 width=8) (actual time=0.006..420.212 rows=477889 loops=1)
Total runtime: 1251.676 ms
Query performance improves when distinct clause on primary key column is removed.
explain analyze select count(pk_column) from table;
QUERY PLAN
Aggregate (cost=35797.30..35797.31 rows=1 width=8) (actual time=817.994..817.995 rows=1 loops=1)
-> Seq Scan on table (cost=0.00..34606.24 rows=476424 width=8) (actual time=0.006..434.674 rows=477890 loops=1)
Total runtime: 818.040 ms
不改变查询,是否可以告诉postgres忽略主键列上的distinct子句?
Version: PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu
,由gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
答案 0 :(得分:1)
您的测试方法似乎存在缺陷。这些是相同的查询计划,并且没有为DISTINCT
添加额外的计划节点。
我认为你只是看到了缓存效果。重复几次,他们可能会开始相似。
这实际上是slow counting problem的一种变体,它通过添加index-only scans在PostgreSQL 9.2中得到了很大的解决。考虑升级。
感谢您从一开始就包含您的版本详细信息和explain analyze
输出。
答案 1 :(得分:1)
我可以在几次运行后在9.2中重现它:
create table t (pk_column serial primary key);
insert into t
select generate_series(1, 477890);
analyze t;
explain analyze select count(distinct pk_column) from t;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Aggregate (cost=8088.63..8088.64 rows=1 width=4) (actual time=210.755..210.755 rows=1 loops=1)
-> Seq Scan on t (cost=0.00..6893.90 rows=477890 width=4) (actual time=0.012..28.018 rows=477890 loops=1)
Total runtime: 210.790 ms
explain analyze select count(pk_column) from t;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Aggregate (cost=8088.63..8088.64 rows=1 width=4) (actual time=51.892..51.892 rows=1 loops=1)
-> Seq Scan on t (cost=0.00..6893.90 rows=477890 width=4) (actual time=0.005..24.541 rows=477890 loops=1)
Total runtime: 51.911 ms
答案 2 :(得分:0)
您确定您的PK列实际上是主键吗?具有不同的EXPLAIN ANALYZE显示477889行,而没有distinct的EXPLAIN ANALYZE显示477890.