I have four tables to pull data from:
user: first_name
mongouser: email, card_status
transaction: transaction_type, balance, posted_at, is_atm, is_purchase
user_login: user_id, login_date, login_id
...
Everything worked well until I added the fourth table, user_login. The fourth JOIN, however, made everything slow. This is the query I wrote:
SELECT *
FROM (
    SELECT
        ssluserid,
        first_name,
        m.email,
        zipcode,
        date_part('year', age(birthday)) AS birthday,
        (current_date - DATE(created_date)) AS duration,
        CASE WHEN card_status = 'ACTIVE' THEN 1 ELSE 0 END AS IS_ACTIVE,
        SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN balance END) AS LOAD_AMT,
        SUM(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN balance END) AS SPEND_AMT,
        COUNT(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN balance END) AS LOAD_CT,
        COUNT(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN balance END) AS SPEND_CT,
        MIN(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN DATE(posted_at) END) AS FIRST_LOAD,
        MAX(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN DATE(posted_at) END) AS LAST_LOAD,
        MIN(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN DATE(posted_at) END) AS FIRST_SPEND,
        MAX(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN DATE(posted_at) END) AS LAST_SPEND,
        SUM(CASE WHEN transaction_type = 'Debit' AND is_atm = 't' AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days'
            THEN balance END) AS ATM_AMT,
        SUM(CASE WHEN transaction_type = 'Debit' AND is_purchase = 't' AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days'
            THEN balance END) AS POS_AMT,
        SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days'
            THEN balance END) AS LOAD_VOL,
        COUNT(CASE WHEN DATE(login_date) >= CURRENT_DATE - INTERVAL '90 days'
            THEN login_id END) AS CT_LOGIN
    FROM mongouser m
    LEFT OUTER JOIN user u
        ON m.userid = u.id
    LEFT OUTER JOIN transactions t
        ON u.id = t.user_id
    LEFT OUTER JOIN user_login l
        ON m.userid = l.user_id
    GROUP BY 1,2,3,4,5,6,7
) t
WHERE LAST_LOAD >= CURRENT_DATE - INTERVAL '90 days'
ORDER BY 9 DESC;
This query has been running for nearly 40 minutes... Is there any way to optimize it?
Answer 0 (score: 1)
Look closely at your own description and you will see where the problem is. Before, you had this:
LEFT OUTER JOIN user u
ON m.userid = u.id
and you said things were not slow. Then you added this:
LEFT OUTER JOIN user_login l
ON m.userid = l.user_id
and you said things got slow. You probably have an index on m.userid. Do you have an index on l.user_id? If not, create one:
CREATE INDEX foo ON user_login ( user_id );
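
To check whether the index exists and whether it changes the plan, something like the following may help. This is only a sketch: it assumes PostgreSQL (which the date_part/age/INTERVAL syntax in the question suggests) and uses the table and column names from the question. Whether the planner actually picks the index depends on table sizes and statistics.

```sql
-- Assumption: PostgreSQL. Table and column names are taken from the question.

-- 1. List the indexes that already exist on user_login.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'user_login';

-- 2. After creating the index, refresh the planner statistics for the table.
ANALYZE user_login;

-- 3. Look at the plan for the new join in isolation
--    (EXPLAIN only shows the plan, it does not run the query).
EXPLAIN
SELECT m.userid, COUNT(l.login_id) AS ct_login
FROM mongouser m
LEFT OUTER JOIN user_login l
    ON m.userid = l.user_id
GROUP BY m.userid;
```

If the plan still shows a very large row count once the full query is explained, note that transactions and user_login are both joined on the user, so each user produces (number of transactions × number of logins) rows before the GROUP BY. That fan-out also inflates the SUM and COUNT columns. Aggregating the logins per user in a subquery first, and joining that result, is one way to avoid it.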