Optimizing a Postgres query

Date: 2012-04-16 10:32:18

Tags: sql postgresql sql-execution-plan

                                 QUERY PLAN                                   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=32164.87..32164.89 rows=1 width=44) (actual time=221552.831..221552.831 rows=0 loops=1)
   ->  Sort  (cost=32164.87..32164.87 rows=1 width=44) (actual time=221552.827..221552.827 rows=0 loops=1)
         Sort Key: t.date_effective, t.acct_account_transaction_id, p.method, t.amount, c.business_name, t.amount
         ->  Nested Loop  (cost=22871.67..32164.86 rows=1 width=44) (actual time=221552.808..221552.808 rows=0 loops=1)
               ->  Nested Loop  (cost=22871.67..32160.37 rows=1 width=52) (actual time=221431.071..221546.619 rows=670 loops=1)
                     ->  Nested Loop  (cost=22871.67..32157.33 rows=1 width=43) (actual time=221421.218..221525.056 rows=2571 loops=1)
                           ->  Hash Join  (cost=22871.67..32152.80 rows=1 width=16) (actual time=221307.382..221491.019 rows=2593 loops=1)
                                 Hash Cond: ("outer".acct_account_id = "inner".acct_account_fk)
                                 ->  Seq Scan on acct_account a  (cost=0.00..7456.08 rows=365008 width=8) (actual time=0.032..118.369 rows=61295 loops=1)
                                 ->  Hash  (cost=22871.67..22871.67 rows=1 width=16) (actual time=221286.733..221286.733 rows=2593 loops=1)
                                       ->  Nested Loop Left Join  (cost=0.00..22871.67 rows=1 width=16) (actual time=1025.396..221266.357 rows=2593 loops=1)
                                             Join Filter: ("inner".orig_acct_payment_fk = "outer".acct_account_transaction_id)
                                             Filter: ("inner".link_type IS NULL)
                                             ->  Seq Scan on acct_account_transaction t  (cost=0.00..18222.98 rows=1 width=16) (actual time=949.081..976.432 rows=2596 loops=1)
                                                   Filter: ((("type")::text = 'debit'::text) AND ((transaction_status)::text = 'active'::text) AND (date_effective >= '2012-03-01'::date) AND (date_effective < '2012-04-01 00:00:00'::timestamp without time zone))
                                             ->  Seq Scan on acct_payment_link l  (cost=0.00..4648.68 rows=1 width=15) (actual time=1.073..84.610 rows=169 loops=2596)
                                                   Filter: ((link_type)::text ~~ 'return_%'::text)
                           ->  Index Scan using contact_pk on contact c  (cost=0.00..4.52 rows=1 width=27) (actual time=0.007..0.008 rows=1 loops=2593)
                                 Index Cond: (c.contact_id = "outer".contact_fk)
                     ->  Index Scan using acct_payment_transaction_fk on acct_payment p  (cost=0.00..3.02 rows=1 width=13) (actual time=0.005..0.005 rows=0 loops=2571)
                           Index Cond: (p.acct_account_transaction_fk = "outer".acct_account_transaction_id)
                           Filter: ((method)::text <> 'trade'::text)
               ->  Index Scan using contact_role_pk on contact_role  (cost=0.00..4.48 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=670)
                     Index Cond: ("outer".contact_id = contact_role.contact_fk)
                     Filter: (exchange_fk = 74)
Total runtime: 221553.019 ms

4 answers:

Answer 0 (score: 4)

Your problem is here:

->  Nested Loop Left Join  (cost=0.00..22871.67 rows=1 width=16) (actual time=1025.396..221266.357 rows=2593 loops=1)
    Join Filter: ("inner".orig_acct_payment_fk = "outer".acct_account_transaction_id)
    Filter: ("inner".link_type IS NULL)
        ->  Seq Scan on acct_account_transaction t  (cost=0.00..18222.98 rows=1 width=16) (actual time=949.081..976.432 rows=2596 loops=1)
                Filter: ((("type")::text = 'debit'::text) AND ((transaction_status)::text = 'active'::text) AND (date_effective >= '2012-03-01'::date) AND (date_effective   
        ->  Seq Scan on acct_payment_link l  (cost=0.00..4648.68 rows=1 width=15) (actual time=1.073..84.610 rows=169 loops=2596)
                Filter: ((link_type)::text ~~ 'return_%'::text)

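The planner expects to find 1 row in acct_account_transaction but actually finds 2596. The same goes for the other table, acct_payment_link: 1 row estimated, 169 returned on each of the 2596 loops.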

You didn't mention your Postgres version (could you?), but this should do the trick:

SELECT DISTINCT
    t.date_effective,
    t.acct_account_transaction_id,
    p.method,
    t.amount,
    c.business_name,
    t.amount
FROM
    contact c inner join contact_role on (c.contact_id=contact_role.contact_fk and contact_role.exchange_fk=74),
    acct_account a, acct_payment p,
    acct_account_transaction t
WHERE
    p.acct_account_transaction_fk=t.acct_account_transaction_id
    and t.type = 'debit'
    and t.transaction_status = 'active'
    and p.method != 'trade'
    and t.date_effective >= '2012-03-01'
    and t.date_effective < (date '2012-03-01' + interval '1 month')
    and c.contact_id=a.contact_fk and a.acct_account_id = t.acct_account_fk
    -- anti-join: keep only transactions that have no 'return_%' link
    and not exists(
         select * from acct_payment_link l 
           where l.orig_acct_payment_fk = t.acct_account_transaction_id 
           and l.link_type like 'return_%'
    )
ORDER BY
    t.date_effective DESC

Also, try setting an appropriate statistics target for the relevant columns. Link to the fine manual: http://www.postgresql.org/docs/current/static/sql-altertable.html
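A minimal sketch of what raising the statistics target might look like for the columns behind the misestimated filters. The value 500 is an arbitrary illustration, not a recommendation, and the choice of columns is an assumption based on the filters shown in the plan:

```sql
-- Assumption: these are the columns whose row estimates are off in the plan.
ALTER TABLE acct_account_transaction ALTER COLUMN date_effective SET STATISTICS 500;
ALTER TABLE acct_account_transaction ALTER COLUMN "type" SET STATISTICS 500;
ALTER TABLE acct_payment_link ALTER COLUMN link_type SET STATISTICS 500;

-- A new target only takes effect once statistics are gathered again.
ANALYZE acct_account_transaction;
ANALYZE acct_payment_link;
```

After that, rerunning EXPLAIN ANALYZE shows whether the estimated row counts have moved closer to the actual ones.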

Answer 1 (score: 0)

I removed my first suggestion because it changed the nature of the query.

I see that the LEFT JOIN is taking far too much time.

  1. The first thing to try is to scan the acct_payment_link table only once. You could try rewriting the query as:

    ... LEFT JOIN (SELECT * FROM acct_payment_link
                   WHERE link_type LIKE 'return_%') AS l ...
    
  2. You should check your statistics, since the planned row counts differ from the row counts actually returned.

  3. You haven't included the table and index definitions; it would be good to see those.

  4. You might also want to build an index on acct_payment_link.link_type using the contrib/pg_trgm extension, but I would try this as a last option.

  5. BTW, which PostgreSQL version are you using?
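Points 4 and 5 could be sketched roughly as follows. This assumes PostgreSQL 9.1 or later, where CREATE EXTENSION exists and pg_trgm indexes can serve LIKE searches; on older releases the contrib module is installed differently. The index name is hypothetical:

```sql
-- Point 5: report the server version.
SELECT version();

-- Point 4 (assumes PostgreSQL 9.1 or later).
CREATE EXTENSION pg_trgm;

-- Hypothetical index name; gin_trgm_ops is the operator class shipped with pg_trgm.
CREATE INDEX acct_payment_link_link_type_trgm_idx
    ON acct_payment_link USING gin (link_type gin_trgm_ops);
```

Note that in a LIKE pattern the underscore in 'return_%' is itself a single-character wildcard, so the pattern matches any value that starts with "return" followed by at least one more character.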

Answer 2 (score: 0)

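What are your indexes, and have you run ANALYZE recently? It is doing a table scan on acct_account_transaction, even though there are several conditions on that table: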

  • date_effective

If there is no index on these columns, a composite index on (type, date_effective) could help (assuming many rows do not match the criteria on these columns).
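As a sketch of that suggestion, one could first list the existing indexes and then add the composite one. The index name is hypothetical, and whether the planner actually uses the index depends on how selective the conditions are:

```sql
-- List the indexes that already exist on the table.
SELECT indexname, indexdef
FROM   pg_indexes
WHERE  tablename = 'acct_account_transaction';

-- Hypothetical index name; equality column first, range column second.
CREATE INDEX acct_account_transaction_type_date_idx
    ON acct_account_transaction ("type", date_effective);

-- Refresh statistics so the planner sees current data.
ANALYZE acct_account_transaction;
```

Putting the equality-tested column ahead of the range-tested one lets a single index range scan cover both conditions.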

Answer 3 (score: 0)

Your statement, rewritten and formatted:

SELECT DISTINCT
       t.date_effective,
       t.acct_account_transaction_id,
       p.method,
       t.amount,
       c.business_name,
       t.amount
FROM   contact                  c
JOIN   contact_role            cr ON cr.contact_fk = c.contact_id
JOIN   acct_account             a ON a.contact_fk = c.contact_id 
JOIN   acct_account_transaction t ON t.acct_account_fk = a.acct_account_id 
JOIN   acct_payment             p ON p.acct_account_transaction_fk
                                   = t.acct_account_transaction_id
LEFT   JOIN acct_payment_link   l ON orig_acct_payment_fk
                                   = acct_account_transaction_id
                                        -- missing table-qualification!
                                 AND link_type like 'return_%'
                                        -- missing table-qualification!
WHERE  transaction_status = 'active'    -- missing table-qualification!
AND    cr.exchange_fk = 74
AND    t.type = 'debit'
AND    t.date_effective >= '2012-03-01'
AND    t.date_effective <  (date '2012-03-01' + interval '1 month')
AND    p.method != 'trade'
AND    l.link_type IS NULL
ORDER  BY t.date_effective DESC;
  • It is better to use explicit JOIN syntax. I reordered the tables according to your JOIN logic.

  • Why (date '2012-03-01' + interval '1 month') instead of date '2012-04-01'?

  • Some table qualifications are missing. In a statement this complex, that is bad style, and it may be hiding a bug.
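To illustrate the point about the date expression: adding an interval to a date yields a timestamp, which is why the filter in the question's plan compares date_effective against a timestamp constant. A small sketch:

```sql
-- date + interval produces a timestamp without time zone, not a date.
SELECT date '2012-03-01' + interval '1 month' AS upper_bound;
-- The plan in the question shows this bound as
-- '2012-04-01 00:00:00'::timestamp without time zone.

-- The same range written with plain date literals:
--   t.date_effective >= date '2012-03-01'
--   t.date_effective <  date '2012-04-01'
```

The interval form is convenient when the month is supplied as a parameter; for a fixed month the plain literal is simpler to read.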

The keys to performance are proper indexes, a properly configured PostgreSQL, and accurate statistics.

General advice on performance tuning in the PostgreSQL wiki.
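As a starting point for checking configuration and statistics, something along these lines might help. The selection of settings is only an illustrative assumption about what is worth looking at first, and the second query assumes a release whose pg_stat_user_tables view exposes the last_analyze columns:

```sql
-- A few planner- and memory-related settings that are commonly reviewed.
SHOW shared_buffers;
SHOW work_mem;
SHOW effective_cache_size;
SHOW random_page_cost;

-- When were statistics last gathered for the two problem tables?
SELECT relname, last_analyze, last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname IN ('acct_account_transaction', 'acct_payment_link');
```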