Question

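Background: I have a table of financial transaction records. It holds a very large number of rows belonging to thousands of users. I need to sum the transactions to display balances and to drive other parts of the site.

My current query can get very slow and often times out. I have tried optimizing it, but I can't seem to get it to run efficiently.

Environment: My application runs on Heroku with a Postgres Standard-2 plan (8GB memory, 400 max connections, 256GB of allowed storage). My peak connection count at any given time is around 20, and my current database size is 35GB. According to the statistics, this query averages about 1000 ms, and it runs very frequently, so it has a big impact on site performance.

For the database, the index cache hit rate is 99% and the table cache hit rate is 97%. With the current thresholds, autovacuum runs roughly every other day.

Here is my current transactions table setup: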

CREATE TABLE transactions (
    id bigint DEFAULT nextval('transactions_id_seq'::regclass) NOT NULL,
    user_id integer NOT NULL,
    date timestamp without time zone NOT NULL,
    amount numeric(15,2) NOT NULL,
    transaction_type integer DEFAULT 0 NOT NULL,
    account_id integer DEFAULT 0,
    reconciled integer DEFAULT 0,
    parent integer DEFAULT 0,
    ccparent integer DEFAULT 0,
    created_at timestamp without time zone DEFAULT now() NOT NULL
);
CREATE INDEX transactions_user_id_key ON transactions USING btree (user_id);
CREATE INDEX transactions_user_date_idx ON transactions (user_id, date);
CREATE INDEX transactions_user_ccparent_idx ON transactions (user_id, ccparent) WHERE ccparent > 0;

Here is my current query:

SELECT account_id,
 sum(deposit) - sum(withdrawal) AS balance,
 sum(r_deposit)-sum(r_withdrawal) AS r_balance,
 sum(deposit) AS o_deposit,
 sum(withdrawal) AS o_withdrawal,
 sum(r_deposit) AS r_deposit,
 sum(r_withdrawal) AS r_withdrawal
FROM 
(SELECT t.account_id,
    CASE
        WHEN transaction_type > 0 THEN sum(amount)
        ELSE 0
    END AS deposit,
    CASE
        WHEN transaction_type = 0 THEN sum(amount)
        ELSE 0
    END AS withdrawal,
    CASE
        WHEN transaction_type > 0 AND reconciled=0 THEN sum(amount)
        ELSE 0
    END AS r_deposit,
    CASE
        WHEN transaction_type = 0 AND reconciled=0 THEN sum(amount)
        ELSE 0
    END AS r_withdrawal
FROM transactions AS t
WHERE user_id = $1 AND parent=0 AND ccparent=0
GROUP BY  transaction_type, account_id, reconciled ) AS t0
GROUP BY  account_id;

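The query has several parts. For each account a user has, I have to get the following:

1) The overall balance of the account

2) The balance of all reconciled transactions

3) Separately, the sums of all deposits, withdrawals, reconciled deposits, and reconciled withdrawals.

When I run explain analyze on the query, this is the query plan: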

QUERY PLAN                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=13179.85..13180.14 rows=36 width=132) (actual time=1326.200..1326.204 rows=6 loops=1)
   Group Key: t.account_id
   ->  HashAggregate  (cost=13179.29..13179.58 rows=36 width=18) (actual time=1326.163..1326.171 rows=16 loops=1)
         Group Key: t.transaction_type, t.account_id, t.reconciled
         ->  Bitmap Heap Scan on transactions t  (cost=73.96..13132.07 rows=13491 width=18) (actual time=17.410..1317.863 rows=12310 loops=1)
               Recheck Cond: (user_id = 1)
               Filter: ((parent = 0) AND (ccparent = 0))
               Rows Removed by Filter: 2
               Heap Blocks: exact=6291
               ->  Bitmap Index Scan on transactions_user_id_key  (cost=0.00..73.29 rows=13601 width=0) (actual time=15.901..15.901 rows=12343 loops=1)
                     Index Cond: (user_id = 1)
 Planning time: 0.895 ms
 Execution time: 1326.424 ms

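Does anyone have suggestions on how to speed up this query? As I said, it is the most frequently run query in the application and also one of the most demanding on the database. If I could optimize it, the whole application would benefit enormously.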

Answer 1

Try an index on transactions (user_id, parent, ccparent, transaction_type, account_id, reconciled).

CREATE INDEX transactions_u_p_ccp_tt_a_r_idx
             ON transactions
                (user_id,
                 parent,
                 ccparent,
                 transaction_type,
                 account_id,
                 reconciled);
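
One way to check whether the planner actually picks up the new index is to re-run the plan on the inner part of the query. This is only a sketch: it uses the literal user_id = 1 from the plan in the question in place of $1, and the BUFFERS option is an addition here to show how many blocks come from cache versus disk.

```sql
-- Refresh planner statistics after creating the index.
ANALYZE transactions;

-- Inner aggregation from the question, with user_id = 1 standing in for $1.
EXPLAIN (ANALYZE, BUFFERS)
SELECT account_id, transaction_type, reconciled, sum(amount)
FROM transactions
WHERE user_id = 1 AND parent = 0 AND ccparent = 0
GROUP BY transaction_type, account_id, reconciled;
```

Because amount is not part of this index, each matching row still has to be fetched from the table, so the heap reads that dominate the original plan may well remain.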

Perhaps you could even include amount in the index.

CREATE INDEX transactions_u_p_ccp_tt_a_r_a_idx
             ON transactions
                (user_id,
                 parent,
                 ccparent,
                 transaction_type,
                 account_id,
                 reconciled,
                 amount);
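
With amount in the index, every column the query reads is available from the index itself, so Postgres may be able to answer it with an index-only scan, provided the visibility map is reasonably current (which depends on vacuum). A possible variant, assuming most lookups filter on parent = 0 AND ccparent = 0 as in the question, is a partial index that moves those two conditions into the index predicate. The index name and the single-pass form of the query below are illustrative assumptions, not something taken from the question.

```sql
-- Hypothetical partial index: the parent/ccparent filter becomes the predicate,
-- so only qualifying rows are indexed and the key is shorter.
CREATE INDEX transactions_user_balance_idx
    ON transactions (user_id, account_id, transaction_type, reconciled, amount)
    WHERE parent = 0 AND ccparent = 0;

-- Index-only scans rely on the visibility map, which VACUUM maintains.
VACUUM ANALYZE transactions;

-- Single-pass form of the balance query, using the same conditions as the original.
SELECT account_id,
       sum(CASE WHEN transaction_type > 0 THEN amount
                WHEN transaction_type = 0 THEN -amount
                ELSE 0 END) AS balance,
       sum(CASE WHEN reconciled = 0 AND transaction_type > 0 THEN amount
                WHEN reconciled = 0 AND transaction_type = 0 THEN -amount
                ELSE 0 END) AS r_balance,
       sum(CASE WHEN transaction_type > 0 THEN amount ELSE 0 END) AS o_deposit,
       sum(CASE WHEN transaction_type = 0 THEN amount ELSE 0 END) AS o_withdrawal,
       sum(CASE WHEN transaction_type > 0 AND reconciled = 0 THEN amount ELSE 0 END) AS r_deposit,
       sum(CASE WHEN transaction_type = 0 AND reconciled = 0 THEN amount ELSE 0 END) AS r_withdrawal
FROM transactions
WHERE user_id = $1 AND parent = 0 AND ccparent = 0
GROUP BY account_id;
```

Whether this helps should be confirmed with EXPLAIN (ANALYZE, BUFFERS): an Index Only Scan node with a low Heap Fetches count is the sign that the table itself is no longer being read.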

How to speed up a PostgreSQL aggregate select with a subquery and case statements
