Question

I'm using Postgres 9.3 and wondering whether there is a way to speed up a particular query on a large table. These are my tables:

                                      Table "public.frontend_prescription"
      Column       |          Type           |                             Modifiers
-------------------+-------------------------+--------------------------------------------------------------------
 id                | integer                 | not null default nextval('frontend_prescription_id_seq'::regclass)
 presentation_code | character varying(15)   | not null
 actual_cost       | double precision        | not null
 processing_date   | date                    | not null
 pct_id            | character varying(3)    | not null
Indexes:
    "frontend_prescription_pkey" PRIMARY KEY, btree (id)
    "frontend_prescription_4e2e609b" btree (pct_id)
    "frontend_prescription_528f368c" btree (processing_date)
    "frontend_prescription_b9b2c7ab" btree (presentation_code)
    "frontend_prescription_cost_by_pres_code" btree (presentation_code, pct_id, actual_cost)
    "frontend_prescription_presentation_code_69403ee04fda6522_like" btree (presentation_code varchar_pattern_ops)
    "frontend_prescription_presentation_code_varchar_pattern_ops_idx" btree (presentation_code varchar_pattern_ops)

           Table "public.frontend_pct"
      Column       |          Type           | Modifiers
-------------------+-------------------------+-----------
 code              | character varying(3)    | not null
 name              | character varying(200)  |
 org_type          | character varying(9)    | not null

This is my query, which gets the spending on a given presentation_code by month across all CCG organisations:

SELECT sum(frontend_prescription.actual_cost) as val, 
       frontend_prescription.pct_id as row_id, 
       frontend_prescription.processing_date as date, 
       frontend_pct.name as row_name 
FROM frontend_prescription, frontend_pct 
WHERE (presentation_code='0407041T0BBACAC') 
AND frontend_prescription.pct_id=frontend_pct.code 
AND frontend_pct.org_type='CCG' 
GROUP BY frontend_prescription.pct_id, frontend_pct.code, date 
ORDER BY date, row_id

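Here is the result of EXPLAIN (ANALYSE, BUFFERS) for this query: http://explain.depesz.com/s/YrR5

It looks like the slow part is the Bitmap Heap Scan on frontend_prescription. Is there any way to make this faster? In particular, I notice that it loops 211 times (once for each pct found in the data).

The table has many millions of rows, so I suspect not, but I wanted to check whether there is anything obvious I can do.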

Answer 1

-- the tables (Indexes omitted)
CREATE TABLE frontend_pct (
        code                    varchar(3) NOT NULL PRIMARY KEY
        , name                  varchar(200)
        , org_type              varchar(9) NOT NULL
        );

CREATE TABLE frontend_prescription (
        id                      SERIAL NOT NULL PRIMARY KEY
        , presentation_code     varchar(15) NOT NULL
        , actual_cost           double precision NOT NULL
        , processing_date       date NOT NULL
        , pct_id                varchar(3) NOT NULL REFERENCES frontend_pct(code)
        );

-- the rewritten query (shows some flaws)
EXPLAIN ANALYZE
SELECT
       pr.pct_id as row_id
       , pr.processing_date as zdate -- <<-- renamed this; "date" is a typename
       , pc.name as row_name
        , sum(pr.actual_cost) as val
FROM frontend_prescription pr
JOIN frontend_pct  pc ON pr.pct_id=pc.code AND pc.org_type='CCG'
WHERE pr.presentation_code='0407041T0BBACAC'
GROUP BY pr.pct_id, pc.code, zdate
        -- ^^ ------ ^^ <- these are the same (maybe you meant pc.name ??)
                        -- (which is functionally dependent on pc.code)
ORDER BY zdate, row_id  -- <-- strange order; why not the same as "GROUP BY"
        ;
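
On the bitmap heap scan itself: the existing index frontend_prescription_cost_by_pres_code covers (presentation_code, pct_id, actual_cost) but not processing_date, so every matching row still needs a heap visit to fetch the date. One thing that might be worth trying, as a sketch only, is an index containing every column the query reads from frontend_prescription, so that Postgres (9.2 and later) can consider an index-only scan. The index name below is made up, the column order is an assumption, and whether the planner actually picks an index-only scan depends on your data and on how current the visibility map is.

```sql
-- Hypothetical index name; the column list is an assumption based on the
-- columns this query reads from frontend_prescription.
CREATE INDEX frontend_prescription_code_pct_date_cost
    ON frontend_prescription (presentation_code, pct_id, processing_date, actual_cost);

-- Index-only scans depend on the visibility map, which VACUUM maintains;
-- ANALYZE refreshes the planner statistics at the same time.
VACUUM ANALYZE frontend_prescription;

-- Re-run the plan and compare the buffer counts with the original output.
EXPLAIN (ANALYZE, BUFFERS)
SELECT pr.pct_id            AS row_id
     , pr.processing_date   AS zdate
     , pc.name              AS row_name
     , sum(pr.actual_cost)  AS val
FROM frontend_prescription pr
JOIN frontend_pct pc ON pr.pct_id = pc.code AND pc.org_type = 'CCG'
WHERE pr.presentation_code = '0407041T0BBACAC'
GROUP BY pr.pct_id, pc.name, pr.processing_date
ORDER BY zdate, row_id;
```

If the new plan still shows a bitmap heap scan, or an index-only scan with a large "Heap Fetches" count, the table probably has many pages not yet marked all-visible, and the gain will be smaller. The extra index also costs disk space and slows writes, so it is only worth keeping if this query is run often.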

Postgres: speed up a bitmap heap scan?