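How to improve the performance of a PostgreSQL query on an index scan

Asked: 2014-12-10 23:28:21

Tags: sql postgresql database-indexes

According to the execution plan, the following query spends a large amount of time (3.780 seconds) just on the index scan of the order_line table (line 11 of the execution plan).

Primary keys of the tables: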

Customer PK - c_w_id, c_d_id, c_id
OOrder PK - o_w_id, o_d_id, o_id
Order_line PK - ol_w_id, ol_d_id, ol_o_id, ol_number
Nation PK - n_nationkey

Query:

select   c_id, c_last, sum(ol_amount) as revenue, c_city, c_phone, n_name
from     customer, oorder, order_line, nation
where    c_id = o_c_id
     and c_w_id = o_w_id
     and c_d_id = o_d_id
     and ol_w_id = o_w_id
     and ol_d_id = o_d_id
     and ol_o_id = o_id
     and o_entry_d >= '2007-01-02 00:00:00.000000'
     and o_entry_d <= ol_delivery_d
     and n_nationkey = ascii(substr(c_state,1,1))
group by c_id, c_last, c_city, c_phone, n_name
order by revenue desc

How can I improve the performance of this query? Which materialized view would you recommend? Would the following be a good choice?

CREATE MATERIALIZED VIEW mview AS
select c_id, c_last, ol_amount, c_city, c_phone, o_entry_d, ol_delivery_d, c_state
from     customer, oorder, order_line
where    c_id = o_c_id
     and c_w_id = o_w_id
     and c_d_id = o_d_id
     and ol_w_id = o_w_id
     and ol_d_id = o_d_id
     and ol_o_id = o_id;

2 answers:

Answer 0 (score: 2)

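  1. Never use column names like these.
  2. If you cannot avoid 1), then never write column names without a table prefix in queries (there is no need to put the prefix into the column names themselves).
  3. Prefer writing joins in the "join" clause rather than in "where", so that you can work with the hierarchy of joins.
  4. Finally, did you run

    analyse customer;
    analyse order_line;
    analyse oorder;
    analyse nation;
    

    after adding an index on ascii(substr(c_state,1,1))? If not, run it.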
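
    As an illustration of points 2 and 3, here is a sketch of the query from the question rewritten with explicit joins and table aliases. The aliases c, o, ol and n are made up for this example. For inner joins this is mainly a readability change and is not expected to change the plan by itself.

    ```sql
    -- Same query as in the question, with explicit JOINs and table aliases.
    -- The aliases (c, o, ol, n) are illustrative only.
    select   c.c_id, c.c_last, sum(ol.ol_amount) as revenue,
             c.c_city, c.c_phone, n.n_name
    from     customer c
    join     oorder o
      on     o.o_c_id = c.c_id
         and o.o_w_id = c.c_w_id
         and o.o_d_id = c.c_d_id
    join     order_line ol
      on     ol.ol_w_id = o.o_w_id
         and ol.ol_d_id = o.o_d_id
         and ol.ol_o_id = o.o_id
         and ol.ol_delivery_d >= o.o_entry_d
    join     nation n
      on     n.n_nationkey = ascii(substr(c.c_state, 1, 1))
    where    o.o_entry_d >= '2007-01-02 00:00:00.000000'
    group by c.c_id, c.c_last, c.c_city, c.c_phone, n.n_name
    order by revenue desc;
    ```

    Written this way, each join condition sits next to the table it belongs to, and the only remaining filter (the date range on oorder) is easy to spot in the where clause.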

    You can also fetch n_name with a subquery in the select clause:

    select c_id, c_last, revenue, c_city, c_phone, (select x.n_name from nation x where x.n_nationkey = ascii(c_statecut)) as n_name 
      from (
    select   c_id, c_last, sum(ol_amount) as revenue, c_city, c_phone, substr(c_state,1,1) as c_statecut
    from     customer, oorder, order_line
    where    c_id = o_c_id
         and c_w_id = o_w_id
         and c_d_id = o_d_id
         and ol_w_id = o_w_id
         and ol_d_id = o_d_id
         and ol_o_id = o_id
         and o_entry_d >= '2007-01-02 00:00:00.000000'
         and o_entry_d <= ol_delivery_d
    group by c_id, c_last, c_city, c_phone, c_statecut
    ) finale
    order by revenue desc
    

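    The nation table is a simple lookup table, but it affects the overall plan of the aggregation.

    As for the materialized view: pre-aggregating the order_line table on its own would be the fastest way to get the totals, but because your query contains the condition "o_entry_d <= ol_delivery_d", you cannot compute the aggregate from order_line alone. Try the following as a materialized (and indexed) view, to be joined with customer and nation afterwards.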

    select o_c_id, ol_w_id, ol_d_id, ol_o_id, sum(ol_amount)
      from order_line 
      join oorder
        on     ol_w_id = o_w_id
           and ol_d_id = o_d_id
           and ol_o_id = o_id
           and o_entry_d <= ol_delivery_d -- (very strange condition, which looks redundant... it could be better if you can remove it)
     where o_entry_d >= '2007-01-02 00:00:00.000000'
     group by o_c_id, ol_w_id, ol_d_id, ol_o_id
    

    If this query takes about 3 seconds, then materialize it. If it takes less than 3 seconds, you can join it with the other tables on the fly.
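
    If you do materialize it, a minimal sketch could look like the following. It assumes PostgreSQL 9.3 or later (where materialized views are available). The names order_revenue_mv, order_revenue_mv_cust_idx and the column alias order_revenue are hypothetical, and the index columns are a guess at what the later join to customer would need.

    ```sql
    -- Hypothetical view name; the body is the aggregate query shown above,
    -- with an alias added to the sum so it can be referenced later.
    create materialized view order_revenue_mv as
    select o_c_id, ol_w_id, ol_d_id, ol_o_id, sum(ol_amount) as order_revenue
      from order_line
      join oorder
        on     ol_w_id = o_w_id
           and ol_d_id = o_d_id
           and ol_o_id = o_id
           and o_entry_d <= ol_delivery_d
     where o_entry_d >= '2007-01-02 00:00:00.000000'
     group by o_c_id, ol_w_id, ol_d_id, ol_o_id;

    -- Assumed index to support the join back to customer (name is illustrative).
    create index order_revenue_mv_cust_idx
        on order_revenue_mv (ol_w_id, ol_d_id, o_c_id);

    analyze order_revenue_mv;

    -- Final report: join the pre-aggregated rows with customer and nation.
    select   c.c_id, c.c_last, sum(mv.order_revenue) as revenue,
             c.c_city, c.c_phone, n.n_name
    from     order_revenue_mv mv
    join     customer c
      on     c.c_w_id = mv.ol_w_id
         and c.c_d_id = mv.ol_d_id
         and c.c_id   = mv.o_c_id
    join     nation n
      on     n.n_nationkey = ascii(substr(c.c_state, 1, 1))
    group by c.c_id, c.c_last, c.c_city, c.c_phone, n.n_name
    order by revenue desc;
    ```

    Note that the date filter is baked into the view, so this only helps if the cutoff is fixed, and the view has to be refreshed (refresh materialized view order_revenue_mv;) whenever the underlying tables change.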

Answer 1 (score: 0)

You could try an expression index on ascii(substr(c_state,1,1)). You are currently doing a sequential scan on customer.
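
A minimal sketch of such an index, with a made-up index name. Whether the planner actually uses it depends on the data and the rest of the plan; since the query reads most of customer anyway, a sequential scan may still be chosen.

```sql
-- Expression index on the value that is compared with n_nationkey.
-- customer_state_nation_idx is a hypothetical name.
create index customer_state_nation_idx
    on customer ((ascii(substr(c_state, 1, 1))));

-- Re-collect statistics so the planner also has statistics
-- for the indexed expression.
analyze customer;
```

Checking the plan again with explain analyze after this step shows whether the index made any difference.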