Postgres: bitmap heap scan or index scan

Time: 2016-11-09 13:00:42

Tags: sql postgresql bitmap query-performance

I have two tables:

Table 1:

CREATE TABLE san_material (
    id integer NOT NULL,
    detection_datetime timestamp with time zone
);

CREATE INDEX san_material_detection_datetime_cb0a0120_uniq ON san_material USING btree (detection_datetime);

ALTER TABLE ONLY san_material
    ADD CONSTRAINT san_material_pkey PRIMARY KEY (id);

Table 2:

CREATE TABLE san_materialdata (
    id integer NOT NULL,
    material_id integer NOT NULL,
    tag_id integer NOT NULL
);
ALTER TABLE ONLY san_materialdata ALTER COLUMN material_id SET STATISTICS 10000;
ALTER TABLE ONLY san_materialdata ALTER COLUMN tag_id SET STATISTICS 10000;

ALTER TABLE ONLY san_materialdata
    ADD CONSTRAINT san_materialdata_pkey PRIMARY KEY (id);

CREATE INDEX san_materialdata_76f094bc ON san_materialdata USING btree (tag_id);

CREATE INDEX san_materialdata_eb4b9aaa ON san_materialdata USING btree (material_id);

CREATE INDEX san_materialdata_material_id_32150950_idx ON san_materialdata USING btree (material_id, tag_id);

ALTER TABLE ONLY san_materialdata
    ADD CONSTRAINT san_materialdata_material_id_5e01090e_fk_san_material_id FOREIGN KEY (material_id) REFERENCES san_material(id) DEFERRABLE INITIALLY DEFERRED;

I have a problem with my query:

The query plan depends on the width of the detection_datetime interval. For example, with a one-day interval:

SELECT COUNT("san_material"."id") AS "__count"
FROM "san_material"
WHERE ("san_material"."detection_datetime"
           BETWEEN '2016-11-01T00:00:00+03:00'::timestamptz
               AND '2016-11-01T23:59:59.999999+03:00'::timestamptz
       AND "san_material"."id" IN (
           SELECT U0."material_id"
           FROM "san_materialdata" U0
           WHERE (U0."tag_id" IN (602))))

it uses a plan with an index scan:

Aggregate  (cost=670656.06..670656.07 rows=1 width=4)
  ->  Nested Loop Semi Join  (cost=1.00..670645.44 rows=4250 width=4)
        ->  Index Scan using san_material_detection_datetime_cb0a0120_uniq on san_material  (cost=0.43..83416.26 rows=143242 width=4)
              Index Cond: ((detection_datetime >= '2016-11-01 00:00:00+03'::timestamp with time zone) AND (detection_datetime <= '2016-11-01 23:59:59.999999+03'::timestamp with time zone))
        ->  Index Only Scan using san_materialdata_material_id_32150950_idx on san_materialdata u0  (cost=0.57..4.10 rows=1 width=4)
              Index Cond: ((material_id = san_material.id) AND (tag_id = 602))

But if I widen the date range to two days:

SELECT COUNT("san_material"."id") AS "__count"
FROM "san_material"
WHERE ("san_material"."detection_datetime"
           BETWEEN '2016-11-01T00:00:00+03:00'::timestamptz
               AND '2016-11-02T23:59:59.999999+03:00'::timestamptz
       AND "san_material"."id" IN (
           SELECT U0."material_id"
           FROM "san_materialdata" U0
           WHERE (U0."tag_id" IN (602))))

it uses a plan with a bitmap heap scan:

Aggregate  (cost=700712.88..700712.89 rows=1 width=4)
  ->  Nested Loop  (cost=691124.49..700692.41 rows=8191 width=4)
        ->  HashAggregate  (cost=691124.06..691135.46 rows=1140 width=4)
              Group Key: u0.material_id
              ->  Bitmap Heap Scan on san_materialdata u0  (cost=5041.54..690483.74 rows=256125 width=4)
                    Recheck Cond: (tag_id = 602)
                    ->  Bitmap Index Scan on san_materialdata_76f094bc  (cost=0.00..4977.51 rows=256125 width=0)
                          Index Cond: (tag_id = 602)
        ->  Index Scan using san_material_pkey on san_material  (cost=0.43..8.37 rows=1 width=4)
              Index Cond: (id = u0.material_id)
              Filter: ((detection_datetime >= '2016-11-01 00:00:00+03'::timestamp with time zone) AND (detection_datetime <= '2016-11-02 23:59:59.999999+03'::timestamp with time zone))
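
Both plans show only planner estimates, so they do not say which strategy is actually faster. As a rough sketch of how the two could be compared on real timings: `EXPLAIN (ANALYZE, BUFFERS)` runs the query and reports actual row counts and timings next to the estimates, and `enable_bitmapscan` is a standard planner setting that discourages bitmap scans. Using it inside a transaction with `SET LOCAL` is an assumption about how one might test this; it is meant for diagnosis only, and the planner may still choose a plan other than the semi join.

```sql
BEGIN;

-- Default plan for the two-day range, with actual rows and timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT("san_material"."id") AS "__count"
FROM "san_material"
WHERE "san_material"."detection_datetime"
          BETWEEN '2016-11-01T00:00:00+03:00'::timestamptz
              AND '2016-11-02T23:59:59.999999+03:00'::timestamptz
  AND "san_material"."id" IN (
      SELECT U0."material_id"
      FROM "san_materialdata" U0
      WHERE U0."tag_id" IN (602));

-- Discourage bitmap scans for the rest of this transaction only.
SET LOCAL enable_bitmapscan = off;

-- Same query again; compare estimated vs. actual rows and total time.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT("san_material"."id") AS "__count"
FROM "san_material"
WHERE "san_material"."detection_datetime"
          BETWEEN '2016-11-01T00:00:00+03:00'::timestamptz
              AND '2016-11-02T23:59:59.999999+03:00'::timestamptz
  AND "san_material"."id" IN (
      SELECT U0."material_id"
      FROM "san_materialdata" U0
      WHERE U0."tag_id" IN (602));

-- Nothing was modified; ROLLBACK also discards the SET LOCAL.
ROLLBACK;
```

A large gap between the estimated and actual row counts on any node would point to where the cost estimate goes wrong.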

I increased the statistics target for the material_id and tag_id columns, but the plan stays the same. How can I fix this?
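
One thing that may be worth checking: `ALTER TABLE ... SET STATISTICS` only changes the per-column target, and the larger sample is collected the next time the table is analyzed. The question does not say whether `ANALYZE` was run after the change, so the following is a sketch under the assumption that it was not.

```sql
-- Re-collect statistics so the new per-column targets take effect.
ANALYZE san_materialdata;

-- Inspect what the planner now knows about the two columns.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'san_materialdata'
  AND attname IN ('material_id', 'tag_id');
```

If the estimates in the plans do not change after this, the statistics target is probably not the limiting factor.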

P.S. Paste: http://dpaste.com/3B823J7

0 answers:

No answers yet