I am using Postgres 9.4. I have a large table. This is its structure:
processing_date | date |
practice_id | character varying(6) |
chemical_id | character varying(9) |
items | bigint |
cost | double precision |
Indexes:
"vw_idx_chem_by_practice_chem_id" btree (chemical_id)
"vw_idx_chem_by_practice_chem_id_vc" btree (chemical_id varchar_pattern_ops)
"vw_idx_chem_by_practice_joint_id" btree (practice_id, chemical_id)
Now I want to run a LIKE query on the table. This is my query:
EXPLAIN (ANALYSE, BUFFERS) SELECT sum(pr.cost) as actual_cost,
sum(pr.items) as items, pr.practice_id as row_id,
pc.name as row_name, pr.processing_date as date
FROM vw_chemical_summary_by_practice pr
JOIN frontend_practice pc ON pr.practice_id=pc.code
WHERE (pr.chemical_id LIKE '0401%' )
GROUP BY pr.practice_id, pc.code, date
ORDER BY date, pr.practice_id;
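Here is the EXPLAIN output: http://explain.depesz.com/s/lYRT
As you can see, it is slow, partly because it runs a bitmap heap scan over nearly 4 million rows. (The subsequent sort is also slow.)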
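Is there anything I can do to speed this up?
I am wondering whether I should create a further materialized view, or whether a multi-column index would help, so that Postgres can read from the index rather than going to disk.
Is there any way to make the sort more efficient?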
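For reference, the multi-column index idea could look roughly like the sketch below. It assumes the goal is an index-only scan (available since Postgres 9.2), so every column the query reads from the materialized view is included in the index. The index name is made up for illustration, and whether the planner actually picks an index-only scan depends on statistics and on the visibility map being up to date.

```sql
-- Hypothetical index name; the columns are those of the materialized view.
-- chemical_id leads (with varchar_pattern_ops) so that LIKE '0401%' can use it.
CREATE INDEX vw_idx_chem_by_practice_covering
    ON vw_chemical_summary_by_practice
       (chemical_id varchar_pattern_ops, practice_id, processing_date, items, cost);

-- Index-only scans rely on the visibility map, so vacuum after each refresh.
VACUUM ANALYZE vw_chemical_summary_by_practice;
```

The trade-off is that such an index is nearly as wide as the view itself, so it mainly helps when the rows matching the prefix are a small fraction of the whole.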
Update: here is the definition of the materialized view:
CREATE MATERIALIZED VIEW vw_chemical_summary_by_practice
AS SELECT processing_date, practice_id, chemical_id,
SUM(total_items) AS items, SUM(actual_cost) AS cost
FROM frontend_prescription
GROUP BY processing_date, practice_id, chemical_id
The underlying table:
id | integer | not null default nextval('frontend_prescription_id_seq'::regclass)
presentation_code | character varying(15) | not null
total_items | integer | not null
actual_cost | double precision | not null
processing_date | date | not null
practice_id | character varying(6) | not null
Indexes:
"frontend_prescription_pkey" PRIMARY KEY, btree (id)
"frontend_prescription_528f368c" btree (processing_date)
"frontend_prescription_6ea07fe3" btree (practice_id)
"frontend_prescription_idx_code" btree (presentation_code varchar_pattern_ops)
"frontend_prescription_idx_date_and_code" btree (processing_date, presentation_code)
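Update 2: in case it is not clear, I need the total spending and items, by practice and by month, across all chemicals whose ID starts with 0401.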
Answer 0 (score: 2)
-- assuming this is your original table:
CREATE TABLE practice_chemical_old
( processing_date date NOT NULL
, practice_id character varying(6) NOT NULL
, chemical_id character varying(9) NOT NULL
, items bigint NOT NULL DEFAULT 0
, cost double precision
);
-- create these three new tables to decompose it into
CREATE TABLE practice
( practice_id SERIAL NOT NULL PRIMARY KEY
, practice_name character varying(6) UNIQUE
);
CREATE TABLE chemical
( chemical_id SERIAL NOT NULL PRIMARY KEY
, chemical_name character varying(9) UNIQUE
);
CREATE TABLE practice_chemical_new
( practice_id INTEGER NOT NULL REFERENCES practice (practice_id)
, chemical_id INTEGER NOT NULL REFERENCES chemical (chemical_id)
, processing_date date NOT NULL
, items bigint NOT NULL default 0
, cost double precision
-- Not sure if processing_date should be part of the key, too
, PRIMARY KEY (practice_id, chemical_id)
);
CREATE UNIQUE INDEX ON practice_chemical_new(chemical_id, practice_id);
INSERT INTO practice(practice_name)
SELECT DISTINCT practice_id FROM practice_chemical_old;
INSERT INTO chemical(chemical_name)
SELECT DISTINCT chemical_id FROM practice_chemical_old;
-- now populate the new tables from the old ones ...
INSERT INTO practice_chemical_new(practice_id, chemical_id, processing_date,items,cost)
SELECT p.practice_id, c.chemical_id, pco.processing_date, pco.items, pco.cost
FROM practice_chemical_old pco
JOIN practice p ON p.practice_name = pco.practice_id
JOIN chemical c ON c.chemical_name = pco.chemical_id
;
-- Now, the original table *could* be represented by the following view (or table, or table expression):
CREATE VIEW practice_chemical_fake AS
SELECT pcn.processing_date AS processing_date
, p.practice_name AS practice_id
, c.chemical_name AS chemical_id
, pcn.items AS items
, pcn.cost AS cost
FROM practice_chemical_new pcn
JOIN practice p ON p.practice_id = pcn.practice_id
JOIN chemical c ON c.chemical_id = pcn.chemical_id
;
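-- A sketch of how the report from the question might be run against the
-- decomposed tables. This is an illustration, not a tested query: it
-- assumes frontend_practice has the "code" and "name" columns used in the
-- question, and that practice_name holds the original practice code.

```sql
-- The LIKE filter touches only the small chemical lookup table;
-- the large fact table is then joined on integer keys.
SELECT sum(pcn.cost)       AS actual_cost
     , sum(pcn.items)      AS items
     , p.practice_name     AS row_id
     , pc.name             AS row_name
     , pcn.processing_date AS date
FROM chemical c
JOIN practice_chemical_new pcn ON pcn.chemical_id = c.chemical_id
JOIN practice p                ON p.practice_id = pcn.practice_id
JOIN frontend_practice pc      ON pc.code = p.practice_name
WHERE c.chemical_name LIKE '0401%'
GROUP BY p.practice_name, pc.name, pcn.processing_date
ORDER BY pcn.processing_date, p.practice_name;
```

-- Whether this is faster in practice depends on how many chemicals match
-- the prefix; compare the plans with EXPLAIN (ANALYZE, BUFFERS).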
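Note: it is not clear from the original question whether there can be multiple {practice, chemical} instances (with different processing_date values). You may need to change the definition of the PK slightly.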
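If there can be several rows per {practice, chemical}, as the GROUP BY in the question's materialized view suggests, the key would need to include processing_date. A possible variant of the table definition, offered as a sketch under that assumption:

```sql
-- Assumes one row per (practice, chemical, processing_date).
CREATE TABLE practice_chemical_new
        ( practice_id INTEGER NOT NULL REFERENCES practice (practice_id)
        , chemical_id INTEGER NOT NULL REFERENCES chemical (chemical_id)
        , processing_date date NOT NULL
        , items bigint NOT NULL DEFAULT 0
        , cost double precision
        , PRIMARY KEY (practice_id, chemical_id, processing_date)
        );
-- Second access path with chemical_id leading, for lookups by chemical.
-- It covers the same three columns as the PK, so it is unique as well.
CREATE UNIQUE INDEX ON practice_chemical_new (chemical_id, processing_date, practice_id);
```

With the original two-column key, the INSERT ... SELECT above would fail on the first practice/chemical pair that appears in more than one month.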