Question

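I have a table with 12 million records. The columns col_1 and col_2 are indexed. I am using PostgreSQL 9.3. I need two kinds of queries. First, some queries with a single condition in the WHERE clause, for example: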

select count(*) 
from table_1
where 
   col_1 >= 123456;

EXPLAIN ANALYZE (for @CraigRinger):

Aggregate  (cost=164523.60..164523.61 rows=1 width=0) (actual time=1803.281..1803.281 rows=1 loops=1)
->  Index Only Scan using table1_col1_idx on table_1  (cost=0.43..151242.20 rows=5312558 width=0) (actual time=60.713..1344.393 rows=5318333 loops=1)
     Index Cond: (col_1 >= 123456)
     Heap Fetches: 0
 Total runtime: 1803.330 ms

And another query:

select count(*) 
from table_1
where
    col_2 >= 987654;

EXPLAIN ANALYZE:

Aggregate  (cost=364134.66..364134.67 rows=1 width=0) (actual time=3935.708..3935.708 rows=1 loops=1)
->  Index Only Scan using table1_col2_idx on table_1  (cost=0.43..334739.38 rows=11758111 width=0) (actual time=7.521..2904.569 rows=11760285 loops=1)
     Index Cond: (col_2 >= 987654)
     Heap Fetches: 0
Total runtime: 3935.760 ms

However, the problem is the long run time of a combined WHERE clause, when two or more conditions are joined with AND / OR. For example:

select count(*)
from table_1
where
    col_1 >= 123456 AND col_2 >= 987654;

EXPLAIN ANALYZE:

 ->  Seq Scan on table_1  (cost=0.00..650822.93 rows=5295377 width=0) (actual time=0.056..45445.711 rows=5301622 loops=1)
     Filter: ((col_2 >= 987654) AND (col_1 >= 123456))
     Rows Removed by Filter: 6494640
Total runtime: 45961.622 ms

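This is unacceptable: 3 seconds versus 45 seconds! Is there any way to speed up this kind of combined query? How can I modify the query to force the planner to use the indexes on col_1 and col_2?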

I also tried: set enable_seqscan = false;

The planner then switched its plan to a bitmap scan, which resulted in a run time of 137 seconds!

Aggregate  (cost=666246.28..666246.29 rows=1 width=0) (actual time=137311.964..137311.964 rows=1 loops=1)
->  Bitmap Heap Scan on table_1  (cost=99440.46..653007.83 rows=5295377 width=0) (actual time=1105.153..136527.723 rows=5301622 loops=1)
     Recheck Cond: (col_1 >= 123456)
     Filter: (col_2 >= 987654)
     Rows Removed by Filter: 16711
     ->  Bitmap Index Scan on table1_col1_idx  (cost=0.00..98116.62 rows=5312558 width=0) (actual time=862.677..862.677 rows=5318333 loops=1)
           Index Cond: (col_1 >= 123456)
Total runtime: 137314.450 ms
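
For anyone reproducing the experiment above, here is a minimal sketch that scopes the planner setting to a single transaction so it does not affect the rest of the session. It assumes the table and column names from the queries above. The BUFFERS option was not part of the runs shown here; it is added only as a suggestion, since it reports how many pages were found in shared buffers versus read from disk.

```sql
BEGIN;

-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL enable_seqscan = off;

-- BUFFERS is an addition (not used in the plans above): it adds
-- shared-buffer hit/read counts to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM table_1
WHERE col_1 >= 123456
  AND col_2 >= 987654;

-- Nothing was modified, so simply discard the transaction.
ROLLBACK;
```

After the ROLLBACK, enable_seqscan is back to its previous value.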

Why does the SQL query with two conditions take so long?

0 answers: