Partial index not used

Time: 2016-07-03 11:54:18

Tags: postgresql postgresql-9.4 postgresql-9.5

Why is the partial index not being used, even though I can see it in the output of \d+ call_records?

\d+ call_records;
 id        | integer | not null default nextval('call_records_id_seq'::regclass) | plain | |
 plain_crn | bigint  |
 active    | boolean | default true
 timestamp | bigint  | default 0
Indexes:
    "index_call_records_on_plain_crn" UNIQUE, btree (plain_crn)
    "index_call_records_on_active" btree (active) WHERE active = true

As expected, a lookup on id is an index scan:

EXPLAIN select * from call_records where id=1;
                                       QUERY PLAN                                       
----------------------------------------------------------------------------------------
 Index Scan using call_records_pkey on call_records  (cost=0.14..8.16 rows=1 width=373)
   Index Cond: (id = 1)
(2 rows)

The same goes for plain_crn:

EXPLAIN select * from call_records where plain_crn=1;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Index Scan using index_call_records_on_plain_crn on call_records  (cost=0.14..8.16 rows=1 width=373)
   Index Cond: (plain_crn = 1)
(2 rows)

But the case of active is different:

EXPLAIN select * from call_records where active=true;
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on call_records  (cost=0.00..12.00 rows=100 width=373)
   Filter: active
(2 rows)


2 Answers:

Answer 0: (score: 3)

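Whether PostgreSQL uses the index on "active" depends on the ratio of true to false. At some point, as the share of true rows grows, the query planner will decide that a table scan is likely to be faster.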
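To see the ratio the planner is working from, you can look at the column statistics. This is only a sketch: it assumes the table and column names from the question and reads the standard pg_stats view, which is populated by ANALYZE.

```sql
-- Refresh the planner statistics first
ANALYZE call_records;

-- most_common_freqs holds the estimated fraction of rows
-- for each value listed in most_common_vals
SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'call_records'
  AND attname = 'active';
```

The larger the estimated fraction for true, the more likely the planner is to prefer a sequential scan for active = true.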

I built a table to test with, and loaded a million rows of random(ish) data.
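The exact setup is not shown here, so the following is only a sketch of how a comparable table could be built. The column list is copied from the question and the index name from the plans below; the roughly 50/50 split via random() and the partial index definition are assumptions.

```sql
-- Scratch table mirroring the columns in the question
-- (run this in a throwaway database)
CREATE TABLE call_records (
    id          serial PRIMARY KEY,
    plain_crn   bigint,
    active      boolean DEFAULT true,
    "timestamp" bigint  DEFAULT 0
);

-- One million rows; random() < 0.5 yields roughly half true, half false
INSERT INTO call_records (plain_crn, active)
SELECT n, random() < 0.5
FROM generate_series(1, 1000000) AS n;

-- Assumed index definition: partial, as in the question
CREATE INDEX call_records_active_idx
    ON call_records (active)
    WHERE active = true;

-- Collect statistics so the planner sees the real distribution
ANALYZE call_records;
```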

select active, count(*)
from call_records
group by active;
active  count
--
f       499983
t       500017

The numbers of true and false rows are about the same. Here's the execution plan.

explain analyze 
select * from call_records where active=true;
"Bitmap Heap Scan on call_records  (cost=5484.82..15344.49 rows=500567 width=21) (actual time=56.542..172.084 rows=500017 loops=1)"
"  Filter: active"
"  Heap Blocks: exact=7354"
"  ->  Bitmap Index Scan on call_records_active_idx  (cost=0.00..5359.67 rows=250567 width=0) (actual time=55.040..55.040 rows=500023 loops=1)"
"        Index Cond: (active = true)"
"Planning time: 0.105 ms"
"Execution time: 204.209 ms"

Then I updated "active", refreshed the statistics, and checked again.

update call_records
set active = true
where id < 750000;

analyze call_records;
explain analyze 
select * from call_records where active=true;
"Seq Scan on call_records  (cost=0.00..22868.00 rows=874100 width=21) (actual time=0.032..280.506 rows=874780 loops=1)"
"  Filter: active"
"  Rows Removed by Filter: 125220"
"Planning time: 0.316 ms"
"Execution time: 337.400 ms"

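Turning off sequential scans shows that, in my case, PostgreSQL made the right decision. The table scan (sequential scan) was about 12 milliseconds faster (337.4 ms versus 349.4 ms).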

set enable_seqscan = off;
explain analyze 
select * from call_records where active=true;
"Index Scan using call_records_active_idx on call_records  (cost=0.42..39071.14 rows=874100 width=21) (actual time=0.031..293.295 rows=874780 loops=1)"
"  Index Cond: (active = true)"
"Planning time: 0.343 ms"
"Execution time: 349.403 ms"

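Note that a plain SET keeps enable_seqscan off for the rest of the session. One way to keep the experiment contained, sketched here with the table name from the question, is to use SET LOCAL inside a transaction so the setting reverts on its own.

```sql
BEGIN;
-- SET LOCAL lasts only until the end of this transaction
SET LOCAL enable_seqscan = off;

EXPLAIN ANALYZE
SELECT * FROM call_records WHERE active = true;

ROLLBACK;  -- the setting reverts here

-- Alternatively, after a plain SET, restore the default explicitly:
RESET enable_seqscan;
```

Keep in mind that EXPLAIN ANALYZE actually runs the statement, which is harmless for a SELECT like this one.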
Answer 1: (score: 2)

You should start by testing the cost of the index scan:

SET enable_seqscan = OFF;

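You will find that it is far higher than the cost of the seqscan. The total number of rows in your table is probably very low. Because you are selecting *, Postgres still has to look up every matching row, so it is easier to scan all rows sequentially than to check the index and then have to fetch most of the pages anyway.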
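To check how small the table looks to the planner, you can read its size estimates from pg_class. This is a sketch that assumes the table name from the question; reltuples and relpages are estimates maintained by VACUUM and ANALYZE, not exact counts.

```sql
-- Planner's view of the table size: estimated rows and 8 kB pages
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = 'call_records';
```

With only a handful of pages, reading the whole table sequentially is cheaper than visiting the index and then most of the heap pages, which is consistent with the cost=0.00..12.00 estimate in the question.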