提高PostgreSQL查询性能

时间:2013-03-03 19:09:31

标签: sql performance postgresql database-design

在我的服务器中运行此查询时,速度很慢,我无法理解原因。任何人都可以帮我搞清楚吗? 查询:

SELECT
    "t_dat"."t_year" AS "c0",
    "t_dat"."t_month" AS "c1",
    "t_dat"."t_week" AS "c2",
    "t_dat"."t_day" AS "c3",
    "t_purs"."p_id" AS "c4",
    sum("t_purs"."days") AS "m0",
    sum("t_purs"."timecreated") AS "m1"
FROM "t_dat", "t_purs"
WHERE "t_purs"."created" = "t_dat"."t_key"
  AND "t_dat"."t_year" = 2013
  AND "t_dat"."t_month" = 3
  AND "t_dat"."t_week" = 9
  AND "t_dat"."t_day" IN (1,2)
  AND "t_purs"."p_id" IN (
      '4','15','18','19','20','29',
      '31','35','46','56','72','78')
GROUP BY
    "t_dat"."t_year",
    "t_dat"."t_month",
    "t_dat"."t_week",
    "t_dat"."t_day",
    "t_purs"."p_id"

解释分析:

HashAggregate  (cost=12252.04..12252.04 rows=1 width=28) (actualtime=10212.374..10212.384 rows=10 loops=1)
  ->  Nested Loop  (cost=0.00..12252.03 rows=1 width=28) (actual time=3016.006..10212.249 rows=14 loops=1)
        Join Filter: (t_dat.t_key = t_purs.created)
        ->  Seq Scan on t_dat  (cost=0.00..129.90 rows=1 width=20) (actual time=0.745..2.040 rows=48 loops=1)
              Filter: ((t_day = ANY ('{1,2}'::integer[])) AND (t_year = 2013) AND (t_month = 3) AND (t_week = 9))
        ->  Seq Scan on t_purs  (cost=0.00..12087.49 rows=9900 width=16) (actual time=0.018..201.630 rows=14014 loops=48)
              Filter: (p_id = ANY ('{4,15,18,19,20,29,31,35,46,56,72,78}'::integer[]))
Total runtime: 10212.470 ms

2 个答案:

答案 0 :(得分:6)

很难说你到底错过了什么,但如果我是你,我会确保跟随索引存在:

CREATE INDEX t_dat_id_date_idx
    ON t_dat (t_key, t_year, t_month, t_week, t_day);

对于t_purs,请创建此索引:

CREATE INDEX t_purs_created_p_id_idx
    ON t_purs (created, p_id);

答案 1 :(得分:1)

考虑在表格中使用单列

t_date date

而不是(t_year, t_month, t_week, t_day)。数据类型date占用4个字节。这会使你的表缩小一点,使索引更小更快,分组更容易。

可以通过{{从{{}日期轻松快速地提取3}}。您的查询可能看起来像这样,并且会更快:

SELECT extract (year  FROM t_date) AS c0
      ,extract (month FROM t_date) AS c1
      ,extract (week  FROM t_date) AS c2
      ,extract (day   FROM t_date) AS c3
      ,p.p_id                      AS c4
      ,sum(p.days)                 AS m0
      ,sum(p.timecreated)          AS m1
FROM   t_dat  d
JOIN   t_purs p ON p.created = d.t_key
WHERE  d.t_date IN ('2013-03-01'::date, '2013-03-02'::date)
AND    p.p_id IN (4,15,18,19,20,29,31,35,46,56,72,78)
GROUP  BY d.t_date, p.p_id;

对性能更重要的是索引,它只是:

CREATE INDEX t_dat_date_idx ON t_dat (t_key, t_date);

或者,取决于数据分布:

CREATE INDEX t_dat_date_idx ON t_dat (t_date, t_key);

extract()您甚至可以创建两者。