Poor cardinality estimates for tables with a composite primary key

Time: 2013-09-26 15:29:42

Tags: sql postgresql cardinality sql-execution-plan

When I join two tables on a composite (two-column) primary key, the query plan shows a wrong cardinality estimate. For example:

CREATE TABLE t1 AS SELECT x, x*2 AS x2 FROM generate_series(0, 1000) AS x;
ALTER TABLE t1 ADD PRIMARY KEY(x, x2);
ANALYZE t1;

CREATE TABLE t2 AS SELECT x, x*2 AS x2 FROM generate_series(0, 1000) AS x;
ALTER TABLE t2 ADD FOREIGN KEY (x, x2) REFERENCES t1(x,x2);
ANALYZE t2;

EXPLAIN ANALYZE
SELECT *
FROM t1 JOIN t2 USING (x, x2);

 QUERY PLAN                                                                                                    
 ------------------------------------------------------------------------------------------------------------- 
 Hash Join  (cost=30.02..52.55 rows=1 width=8) (actual time=0.660..1.551 rows=1001 loops=1)                    
   Hash Cond: ((t1.x = t2.x) AND (t1.x2 = t2.x2))                                                              
   ->  Seq Scan on t1  (cost=0.00..15.01 rows=1001 width=8) (actual time=0.021..0.260 rows=1001 loops=1)       
   ->  Hash  (cost=15.01..15.01 rows=1001 width=8) (actual time=0.620..0.620 rows=1001 loops=1)                
         Buckets: 1024  Batches: 1  Memory Usage: 40kB                                                         
         ->  Seq Scan on t2  (cost=0.00..15.01 rows=1001 width=8) (actual time=0.019..0.230 rows=1001 loops=1) 
 Total runtime: 1.679 ms    

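The plan expects one row to be returned, but 1001 rows actually come back. This is not a problem in simple queries, but in complex queries it leads to very slow plans. How can I help the query optimizer do better?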

2 answers:

Answer 0: (score: 1)

Using a composite primary key in which one column is fully dependent on the other is an "interesting" design.

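In any case, PostgreSQL currently assumes that the selectivities of the individual columns are independent of each other, and so it multiplies them (regardless of whether the columns are in the same index, even if it is the primary key index). I don't know of a good way around that.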
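To make the multiplication concrete, here is a rough sketch of the arithmetic in Python. It assumes a simplified model in which each equality clause gets a selectivity of 1 / max(distinct values on either side), the per-clause selectivities are multiplied together, and the result is clamped to at least one row. The function name `estimate_join_rows` is made up for this illustration; it is not PostgreSQL code and ignores details such as NULLs and most-common-value lists.

```python
# Simplified model of an equi-join row estimate under the independence
# assumption (an illustration only, not PostgreSQL source code).

def estimate_join_rows(rows_left, rows_right, distinct_pairs):
    """distinct_pairs holds one (n_distinct_left, n_distinct_right) per join column."""
    selectivity = 1.0
    for nd_left, nd_right in distinct_pairs:
        # Each equality clause is estimated on its own ...
        selectivity *= 1.0 / max(nd_left, nd_right)
    # ... and the product is applied to the cross product, clamped to >= 1 row.
    return max(1, round(rows_left * rows_right * selectivity))

# t1 and t2 from the question: 1001 rows, 1001 distinct values in x and in x2.
one_column = estimate_join_rows(1001, 1001, [(1001, 1001)])
two_columns = estimate_join_rows(1001, 1001, [(1001, 1001), (1001, 1001)])
print(one_column, two_columns)  # prints: 1001 1
```

Under this model, joining on x alone gives the right answer, while adding the fully dependent column x2 divides the estimate by another 1001, which matches the rows=1 shown in the plan from the question.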

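You could use this dodge to get closer to the true selectivity. The BETWEEN condition is logically equivalent to t1.x2 = t2.x2, but the planner does not estimate it as a second equality clause: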

EXPLAIN ANALYZE
SELECT *
FROM t1 JOIN t2 ON (t1.x = t2.x AND t1.x2 BETWEEN t2.x2 AND t2.x2);

Answer 1: (score: 0)

Another approach is to create key elements that are truly orthogonal:

CREATE TABLE t1 AS SELECT x/100 AS x, x%100 AS x2 FROM generate_series(0, 10000) AS x;
ALTER TABLE t1 ADD PRIMARY KEY(x, x2);
ANALYZE t1;

CREATE TABLE t2 AS SELECT x/100 AS x, x%100 AS x2 FROM generate_series(0, 10000) AS x;
ALTER TABLE t2 ADD PRIMARY KEY (x, x2) ; -- added PK
ALTER TABLE t2 ADD FOREIGN KEY (x, x2) REFERENCES t1(x,x2);

ANALYZE t2;
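
As a rough check of why orthogonal key columns help, the sketch below rebuilds the same (x/100, x%100) pairs in Python and counts the distinct values per column. The final figure assumes a simplified estimate of rows_left * rows_right / (distinct x * distinct x2); that formula is an assumption made for illustration, not something taken from the PostgreSQL source or from a real plan.

```python
# Rebuild the rows produced by:
#   SELECT x/100 AS x, x%100 AS x2 FROM generate_series(0, 10000) AS x
# generate_series is inclusive, so there are 10001 rows.
rows = [(x // 100, x % 100) for x in range(0, 10001)]

n_rows = len(rows)
nd_x = len({r[0] for r in rows})    # distinct values in column x
nd_x2 = len({r[1] for r in rows})   # distinct values in column x2

# Assumed simplified estimate: independent per-column selectivities multiplied.
estimate = n_rows * n_rows / (nd_x * nd_x2)

print(n_rows, nd_x, nd_x2, round(estimate))  # prints: 10001 101 100 9903
```

Because the two columns now vary independently, the product of their distinct counts (101 * 100) is close to the number of rows, so multiplying the per-column selectivities lands near the true join size of 10001 rows under this model.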