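I have a table with 12 million records. There are indexes on col_1 and col_2. I am using PostgreSQL 9.3. I need two kinds of queries. First, some queries with only a single condition in the WHERE clause, for example: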
select count(*)
from table_1
where
col_1 >= 123456;
EXPLAIN ANALYZE output:
Aggregate (cost=164523.60..164523.61 rows=1 width=0) (actual time=1803.281..1803.281 rows=1 loops=1)
-> Index Only Scan using table1_col1_idx on table_1 (cost=0.43..151242.20 rows=5312558 width=0) (actual time=60.713..1344.393 rows=5318333 loops=1)
Index Cond: (col_1 >= 123456)
Heap Fetches: 0
Total runtime: 1803.330 ms
And another query:
select count(*)
from table_1
where
col_2 >= 987654;
EXPLAIN ANALYZE output:
Aggregate (cost=364134.66..364134.67 rows=1 width=0) (actual time=3935.708..3935.708 rows=1 loops=1)
-> Index Only Scan using table1_col2_idx on table_1 (cost=0.43..334739.38 rows=11758111 width=0) (actual time=7.521..2904.569 rows=11760285 loops=1)
Index Cond: (col_2 >= 987654)
Heap Fetches: 0
Total runtime: 3935.760 ms
However, the problem is the long run time of combined WHERE clauses, where two or more conditions are joined with AND / OR. For example:
select count(*)
from table_1
where
col_1 >= 123456 AND col_2 >= 987654;
EXPLAIN ANALYZE output:
-> Seq Scan on table_1 (cost=0.00..650822.93 rows=5295377 width=0) (actual time=0.056..45445.711 rows=5301622 loops=1)
Filter: ((col_2 >= 987654) AND (col_1 >= 123456))
Rows Removed by Filter: 6494640
Total runtime: 45961.622 ms
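This is unacceptable: under 4 seconds for either single condition versus 45 seconds for the combination! So, is there any way to improve such combined queries? How can I modify this query to force the planner to use the indexes on col_1 and col_2?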
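One candidate I can think of, not yet tested on this table, is a single composite index covering both columns, so that the combined condition might be answered by an index-only scan like the single-column queries are. The sketch below is only an assumption about what could help; the index name table1_col1_col2_idx is hypothetical.

```sql
-- Hypothetical index name. CONCURRENTLY avoids blocking writes during the
-- build and must be run outside a transaction block.
CREATE INDEX CONCURRENTLY table1_col1_col2_idx
    ON table_1 (col_1, col_2);

-- Re-check which plan the planner picks for the combined query.
EXPLAIN ANALYZE
SELECT count(*)
FROM table_1
WHERE col_1 >= 123456
  AND col_2 >= 987654;
```

Whether the planner actually chooses this index, and whether it beats the sequential scan, would have to be confirmed from the resulting plan.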
What I have tried so far: set enable_seqscan = false;
The planner then switched to a bitmap scan, which resulted in a run time of 137 seconds!
Aggregate (cost=666246.28..666246.29 rows=1 width=0) (actual time=137311.964..137311.964 rows=1 loops=1)
-> Bitmap Heap Scan on table_1 (cost=99440.46..653007.83 rows=5295377 width=0) (actual time=1105.153..136527.723 rows=5301622 loops=1)
Recheck Cond: (col_1 >= 123456)
Filter: (col_2 >= 987654)
Rows Removed by Filter: 16711
-> Bitmap Index Scan on table1_col1_idx (cost=0.00..98116.62 rows=5312558 width=0) (actual time=862.677..862.677 rows=5318333 loops=1)
Index Cond: (col_1 >= 123456)
Total runtime: 137314.450 ms
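For anyone reproducing the forced plan above, here is a minimal sketch that scopes the setting to one transaction so it does not leak into the rest of the session. The BUFFERS option is my addition, not part of the run shown above. It reports shared buffer hits and reads, which might show whether the time goes into fetching heap pages from disk.

```sql
BEGIN;

-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL enable_seqscan = off;

-- BUFFERS adds shared hit/read counts to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM table_1
WHERE col_1 >= 123456
  AND col_2 >= 987654;

-- Nothing was modified; ending the transaction restores enable_seqscan.
ROLLBACK;
```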