Postgres pgsql query takes a long time to retrieve data from a table with more than 1.5 billion rows

Time: 2017-07-18 19:03:29

Tags: postgresql query-optimization rails-postgresql sql-optimization

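How can I optimize the table or the following pgsql query (it takes 34 minutes to return 770 records)? Indexes have already been added on several columns of the table. I am not sure what else can be done for this query.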

Query:

SELECT 
    min(p.start_timestamp AT TIME ZONE p.timezone AT TIME ZONE 'America/Los_Angeles') as Date, 
    'America/Los_Angeles' AS Timezone, 
    sum(GREATEST(0, p.value)) as Value, 
    p.uom as UnitOfMeasurement
FROM
    pv.bsa_vessel_vs p                                 
WHERE
        p.start_timestamp AT TIME ZONE p.timezone >= '2017-01-01'
    and p.start_timestamp AT TIME ZONE p.timezone <  '2017-02-01'
    and p.vessel_serial_number ='U57625059'
GROUP BY
    date_trunc('hour', p.start_timestamp AT TIME ZONE p.timezone AT TIME ZONE 'America/Los_Angeles'), p.uom   
ORDER BY
    Date ;

Table:

CREATE TABLE pv.bsa_vessel_vs
(
  bsa_vessel_vs_id bigserial NOT NULL,
  data_source_id bigint NOT NULL,
  start_timestamp timestamp without time zone NOT NULL,
  end_timestamp timestamp without time zone NOT NULL,
  value numeric(12,4) NOT NULL,
  uom text NOT NULL,
  timezone text NOT NULL,
  created_timestamp timestamp without time zone DEFAULT now(),
  updated_timestamp timestamp without time zone DEFAULT now(),
  vessel_serial_number text NOT NULL,
  CONSTRAINT bsa_vessel_vs_pkey PRIMARY KEY (bsa_vessel_vs_id),
  CONSTRAINT bsa_vessel_vs_data_source_id_fkey FOREIGN KEY (data_source_id)
      REFERENCES pv.data_source (data_source_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE RESTRICT
)
WITH (
  OIDS=FALSE
);

CREATE INDEX pm_start_timestamp_ndex
  ON pv.bsa_vessel_vs
  USING btree
  (start_timestamp DESC NULLS LAST);

CREATE INDEX bsa_vessel_vs_meter_ts_idx
  ON pv.bsa_vessel_vs
  USING btree
  (vessel_serial_number COLLATE pg_catalog."default", start_timestamp, end_timestamp);


CREATE UNIQUE INDEX bsa_vessel_vs_u_idx
  ON pv.bsa_vessel_vs
  USING btree
  (data_source_id, vessel_serial_number COLLATE pg_catalog."default", start_timestamp, end_timestamp DESC);

Thanks, Karthey

1 answer:

Answer 0: (score: 0)

Change the index so that it contains the expression you use in the WHERE clause, i.e.:

CREATE INDEX bsa_vessel_vs_meter_ts_2_idx
  ON bsa_vessel_vs
  USING btree
  ( vessel_serial_number COLLATE pg_catalog."default", 
    (start_timestamp AT TIME ZONE timezone)
  );

With that index defined, you get an execution plan that uses it:

| QUERY PLAN                                                                                                                                                                                                                                                            |
| :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Sort  (cost=69.60..69.70 rows=39 width=83)                                                                                                                                                                                                                            |
|   Sort Key: (min(timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp))))                                                                                                                                                                         |
|   ->  HashAggregate  (cost=67.79..68.57 rows=39 width=83)                                                                                                                                                                                                             |
|         Group Key: date_trunc('hour'::text, timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp))), uom                                                                                                                                          |
|         ->  Index Scan using bsa_vessel_vs_meter_ts_2_idx on bsa_vessel_vs p  (cost=0.28..67.20 rows=39 width=44)                                                                                                                                                     |
|               Index Cond: ((vessel_serial_number = 'U57625059'::text) AND (timezone(timezone, start_timestamp) >= '2017-01-01 00:00:00+00'::timestamp with time zone) AND (timezone(timezone, start_timestamp) < '2017-02-01 00:00:00+00'::timestamp with time zone)) |
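The costs in this plan are estimates from a small test table, so it is worth checking what the planner does on the real data. A minimal sketch, assuming the table lives in the `pv` schema as in the question and that the session time zone matches the one the application uses (the date literals are interpreted in the session time zone). The `count(*)` is only a stand-in for the full query, and `EXPLAIN ANALYZE` really executes the statement, so it can take a while on a large table:

```sql
-- Refresh planner statistics; ANALYZE also collects statistics
-- for the expressions used in expression indexes.
ANALYZE pv.bsa_vessel_vs;

-- Run the filter for real and report actual row counts, timings
-- and buffer usage next to the planner's estimates.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM pv.bsa_vessel_vs p
WHERE p.start_timestamp AT TIME ZONE p.timezone >= '2017-01-01'
  AND p.start_timestamp AT TIME ZONE p.timezone <  '2017-02-01'
  AND p.vessel_serial_number = 'U57625059';
```

If the output shows an `Index Cond` on both the serial number and the `timezone(timezone, start_timestamp)` expression, the expression index is being used for the whole filter.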

However, without that index, PostgreSQL falls back to a full table scan:

| QUERY PLAN                                                                                                                                                                                                                                                              |
| :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Sort  (cost=298.84..298.94 rows=39 width=83)                                                                                                                                                                                                                            |
|   Sort Key: (min(timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp))))                                                                                                                                                                           |
|   ->  GroupAggregate  (cost=296.35..297.81 rows=39 width=83)                                                                                                                                                                                                            |
|         Group Key: (date_trunc('hour'::text, timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp)))), uom                                                                                                                                          |
|         ->  Sort  (cost=296.35..296.45 rows=39 width=44)                                                                                                                                                                                                                |
|               Sort Key: (date_trunc('hour'::text, timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp)))), uom                                                                                                                                     |
|               ->  Seq Scan on bsa_vessel_vs p  (cost=0.00..295.32 rows=39 width=44)                                                                                                                                                                                     |
|                     Filter: ((vessel_serial_number = 'U57625059'::text) AND (timezone(timezone, start_timestamp) >= '2017-01-01 00:00:00+00'::timestamp with time zone) AND (timezone(timezone, start_timestamp) < '2017-02-01 00:00:00+00'::timestamp with time zone)) |

You can view the whole setup on dbfiddle.
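If building another index on a table of this size is not an option, one possible alternative (not part of the answer above, and only a sketch) is to add a coarse range condition on the raw `start_timestamp` column, so that the existing `bsa_vessel_vs_meter_ts_idx` index on `(vessel_serial_number, start_timestamp, end_timestamp)` can narrow the scan. The exact time-zone-aware conditions stay in place and keep the result unchanged. The two-day margin on each side is an assumption: it is meant to cover the largest possible difference between a row's local time and the session time zone, given that UTC offsets range from about -12 to +14 hours.

```sql
SELECT
    min(p.start_timestamp AT TIME ZONE p.timezone AT TIME ZONE 'America/Los_Angeles') AS Date,
    'America/Los_Angeles' AS Timezone,
    sum(GREATEST(0, p.value)) AS Value,
    p.uom AS UnitOfMeasurement
FROM
    pv.bsa_vessel_vs p
WHERE
        p.vessel_serial_number = 'U57625059'
    -- Coarse, index-friendly range on the raw column
    -- (assumption: widened by two days on each side).
    AND p.start_timestamp >= '2016-12-30'
    AND p.start_timestamp <  '2017-02-03'
    -- Exact filter from the original query, applied to the few rows left.
    AND p.start_timestamp AT TIME ZONE p.timezone >= '2017-01-01'
    AND p.start_timestamp AT TIME ZONE p.timezone <  '2017-02-01'
GROUP BY
    date_trunc('hour', p.start_timestamp AT TIME ZONE p.timezone AT TIME ZONE 'America/Los_Angeles'), p.uom
ORDER BY
    Date;
```

Whether this is faster than the expression index depends on the data distribution, so compare both plans with `EXPLAIN` before relying on either.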