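Background: I have a table of financial transaction records. It holds thousands upon thousands of rows belonging to thousands of users. I need to sum the transactions to display balances and to drive other parts of the site.
My current query gets very slow and often times out. I have tried to optimize it, but I cannot get it to run efficiently.
Environment: my application runs on Heroku with the Postgres Standard-2 plan (8 GB RAM, 400 max connections, 256 GB of allowed storage). At any given time I have at most about 20 connections, and the database is currently 35 GB. According to the statistics, this query averages around 1000 ms and runs very frequently, so it has a large impact on site performance.
For the database, the index cache hit rate is 99% and the table cache hit rate is 97%. With the current thresholds, autovacuum runs roughly every other day.
Here is my current transactions table setup: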
CREATE TABLE transactions (
    id bigint DEFAULT nextval('transactions_id_seq'::regclass) NOT NULL,
    user_id integer NOT NULL,
    date timestamp without time zone NOT NULL,
    amount numeric(15,2) NOT NULL,
    transaction_type integer DEFAULT 0 NOT NULL,
    account_id integer DEFAULT 0,
    reconciled integer DEFAULT 0,
    parent integer DEFAULT 0,
    ccparent integer DEFAULT 0,
    created_at timestamp without time zone DEFAULT now() NOT NULL
);
CREATE INDEX transactions_user_id_key ON transactions USING btree (user_id);
CREATE INDEX transactions_user_date_idx ON transactions (user_id, date);
CREATE INDEX transactions_user_ccparent_idx ON transactions (user_id, ccparent) WHERE ccparent >0;
Here is my current query:
SELECT account_id,
       sum(deposit) - sum(withdrawal) AS balance,
       sum(r_deposit) - sum(r_withdrawal) AS r_balance,
       sum(deposit) AS o_deposit,
       sum(withdrawal) AS o_withdrawal,
       sum(r_deposit) AS r_deposit,
       sum(r_withdrawal) AS r_withdrawal
FROM
  (SELECT t.account_id,
          CASE
              WHEN transaction_type > 0 THEN sum(amount)
              ELSE 0
          END AS deposit,
          CASE
              WHEN transaction_type = 0 THEN sum(amount)
              ELSE 0
          END AS withdrawal,
          CASE
              WHEN transaction_type > 0 AND reconciled = 0 THEN sum(amount)
              ELSE 0
          END AS r_deposit,
          CASE
              WHEN transaction_type = 0 AND reconciled = 0 THEN sum(amount)
              ELSE 0
          END AS r_withdrawal
   FROM transactions AS t
   WHERE user_id = $1 AND parent = 0 AND ccparent = 0
   GROUP BY transaction_type, account_id, reconciled) AS t0
GROUP BY account_id;
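The query has several parts. For each account a user owns, I need the following:
1) The total account balance
2) The balance of all reconciled transactions
3) Separate totals for all deposits, withdrawals, reconciled deposits, and reconciled withdrawals.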
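As an aside, the per-account figures listed above can also be computed in a single aggregation pass with conditional aggregates. The following is only a sketch under some assumptions: it relies on the `FILTER` clause (PostgreSQL 9.4 or later), the subquery alias `totals` is a made-up name, and `COALESCE` is there to mirror the `ELSE 0` behaviour of the nested version. It is meant to simplify the query; it does not reduce the number of rows that must be read.

```sql
-- Sketch: one-pass conditional aggregation instead of two GROUP BY levels.
-- "totals" is a hypothetical alias; COALESCE returns 0 when no rows match,
-- matching the ELSE 0 branches of the nested query.
SELECT account_id,
       o_deposit - o_withdrawal AS balance,
       r_deposit - r_withdrawal AS r_balance,
       o_deposit,
       o_withdrawal,
       r_deposit,
       r_withdrawal
FROM (
    SELECT account_id,
           COALESCE(sum(amount) FILTER (WHERE transaction_type > 0), 0) AS o_deposit,
           COALESCE(sum(amount) FILTER (WHERE transaction_type = 0), 0) AS o_withdrawal,
           COALESCE(sum(amount) FILTER (WHERE transaction_type > 0 AND reconciled = 0), 0) AS r_deposit,
           COALESCE(sum(amount) FILTER (WHERE transaction_type = 0 AND reconciled = 0), 0) AS r_withdrawal
    FROM transactions
    WHERE user_id = $1 AND parent = 0 AND ccparent = 0
    GROUP BY account_id
) AS totals;
```

The output columns keep the same names as in the nested query, so application code that reads the result should not need to change.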
When I run EXPLAIN ANALYZE on my current query, this is the query plan:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=13179.85..13180.14 rows=36 width=132) (actual time=1326.200..1326.204 rows=6 loops=1)
Group Key: t.account_id
-> HashAggregate (cost=13179.29..13179.58 rows=36 width=18) (actual time=1326.163..1326.171 rows=16 loops=1)
Group Key: t.transaction_type, t.account_id, t.reconciled
-> Bitmap Heap Scan on transactions t (cost=73.96..13132.07 rows=13491 width=18) (actual time=17.410..1317.863 rows=12310 loops=1)
Recheck Cond: (user_id = 1)
Filter: ((parent = 0) AND (ccparent = 0))
Rows Removed by Filter: 2
Heap Blocks: exact=6291
-> Bitmap Index Scan on transactions_user_id_key (cost=0.00..73.29 rows=13601 width=0) (actual time=15.901..15.901 rows=12343 loops=1)
Index Cond: (user_id = 1)
Planning time: 0.895 ms
Execution time: 1326.424 ms
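In the plan above, almost all of the time is spent in the Bitmap Heap Scan: it finishes at 1317.863 ms of the 1326.424 ms total and visits 6291 heap blocks to fetch 12310 rows, while the two HashAggregate steps add only a few milliseconds. To see how many of those blocks were served from shared buffers and how many had to be read in, `EXPLAIN` accepts a `BUFFERS` option. A minimal sketch, assuming the literal `user_id = 1` from the plan and a shortened select list (the inner grouping only):

```sql
-- Sketch: same filter and grouping as the inner query, with buffer statistics.
-- user_id = 1 is taken from the plan above; substitute a representative user.
EXPLAIN (ANALYZE, BUFFERS)
SELECT account_id, transaction_type, reconciled, sum(amount)
FROM transactions
WHERE user_id = 1 AND parent = 0 AND ccparent = 0
GROUP BY transaction_type, account_id, reconciled;
```

Each plan node then carries a `Buffers: shared hit=... read=...` line. A large `read` count on the heap scan would suggest that the user's rows are scattered over many pages that are not cached, which is the cost a covering index aims to avoid.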
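Does anyone have suggestions for speeding up this query? As I said, it is the most frequently run query in the application and one of the most demanding on the database. If I can optimize it, the whole application will benefit enormously.
Answer 0 (score: 0)
Try creating an index on transactions (user_id, parent, ccparent, transaction_type, account_id, reconciled):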
CREATE INDEX transactions_u_p_ccp_tt_a_r_idx
    ON transactions
    (user_id,
     parent,
     ccparent,
     transaction_type,
     account_id,
     reconciled);
You could perhaps even include amount in the index, so that it contains every column the query reads:
CREATE INDEX transactions_u_p_ccp_tt_a_r_a_idx
    ON transactions
    (user_id,
     parent,
     ccparent,
     transaction_type,
     account_id,
     reconciled,
     amount);
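A possible variation, offered as an untested sketch rather than a recommendation: because the query always filters on `parent = 0 AND ccparent = 0`, those two conditions could move into a partial-index predicate, which keeps the index smaller. The index name below is hypothetical. Two assumptions are worth stating: an index-only scan on a partial index whose predicate columns are not indexed is, as far as I know, only considered on PostgreSQL 9.6 and later, and index-only scans depend on the visibility map, which `VACUUM` maintains, so a table vacuumed only every other day may still need heap visits for recently modified pages.

```sql
-- Hypothetical index name; run outside a transaction block because
-- CREATE INDEX CONCURRENTLY cannot run inside one.
CREATE INDEX CONCURRENTLY transactions_user_totals_idx
    ON transactions (user_id, account_id, transaction_type, reconciled, amount)
    WHERE parent = 0 AND ccparent = 0;

-- Refresh the visibility map and planner statistics so an
-- index-only scan can be considered.
VACUUM (ANALYZE) transactions;
```

After that, re-running EXPLAIN ANALYZE on the query shows whether the planner picks an `Index Only Scan`, and the reported `Heap Fetches` count indicates how often it still had to visit the table.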