I'm worried about performance.
Can the following MySQL query be optimized?
SELECT u.name, t2.transactions, o2.orders FROM users AS u
LEFT JOIN (
SELECT t.aid AS tuid, SUM( IF( t.status = 1, t.amount, 0 ) ) AS transactions
FROM transactions AS t
WHERE ( t.datetime BETWEEN ('2018-01-01 00:00:00') AND ( '2020-01-01 23:59:59' ) ) GROUP BY t.aid
) AS t2 ON tuid = u.id
LEFT JOIN (
SELECT o.aid AS ouid, SUM(o.net) AS orders FROM orders AS o
WHERE ( o.date BETWEEN ('2018-01-01 00:00:00') AND ( '2020-01-01 23:59:59' ) ) GROUP BY o.aid
) AS o2 ON ouid = u.id
WHERE u.status = 1
ORDER BY t2.transactions DESC
Basically, I need to aggregate user data from multiple tables (and be able to sort on it).
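Answer 0 (score: 2)
There are no obvious performance antipatterns in your query. Its performance depends heavily on the two subqueries with GROUP BY clauses.
Let's look at one of them for possible improvements.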
SELECT t.aid AS tuid,
SUM( IF( t.status = 1, t.amount, 0 ) ) AS transactions
FROM afs_transactions AS t
WHERE t.datetime BETWEEN '2018-01-01 00:00:00' AND '2020-01-01 23:59:59'
GROUP BY t.aid
This is fine if you have an index on afs_transactions.datetime.
But the whole subquery can be rewritten:
SELECT t.aid AS tuid,
SUM( t.amount ) AS transactions
FROM afs_transactions AS t
WHERE t.datetime BETWEEN '2018-01-01 00:00:00' AND '2020-01-01 23:59:59'
AND t.status = 1
GROUP BY t.aid
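This query can use a compound index on (status, datetime). If many rows have a status other than 1, and the compound index exists, the rewritten query will be faster, because those rows are never read.
One difference to be aware of: a user whose in-range transactions all have a status other than 1 no longer gets a row from the subquery, so the outer LEFT JOIN returns NULL for that user instead of 0.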
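A minimal sketch of creating that compound index and checking that the optimizer picks it up. The index names are made up for illustration, and the wider "covering" variant is an optional assumption, not something the rewrite requires.

```sql
-- Hypothetical index name; the table name follows the subquery above.
ALTER TABLE afs_transactions
  ADD INDEX idx_status_datetime (status, datetime);

-- Optional wider variant (an assumption): it also contains aid and amount,
-- so the subquery could be answered from the index alone.
-- ALTER TABLE afs_transactions
--   ADD INDEX idx_status_datetime_cover (status, datetime, aid, amount);

-- Inspect the plan: the "key" column shows which index was chosen.
EXPLAIN
SELECT t.aid AS tuid,
       SUM( t.amount ) AS transactions
FROM afs_transactions AS t
WHERE t.datetime BETWEEN '2018-01-01 00:00:00' AND '2020-01-01 23:59:59'
  AND t.status = 1
GROUP BY t.aid;
```

If the plan still shows a full table scan, the date range may simply cover most of the table, in which case the optimizer can reasonably prefer the scan.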
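Pro tip: BETWEEN is generally a poor choice for datetime ranges because of the 59:59 endpoint: any value later than 23:59:59 on the last day (for example one with fractional seconds) falls outside the range. Use < for the end of the range rather than BETWEEN's <=.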
WHERE t.datetime >= '2018-01-01'
AND t.datetime < '2020-01-02' /* notice, it's the day after the range */
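Putting both suggestions from this answer into the original query might look like the sketch below. It uses the table names from the question (transactions, orders). The COALESCE calls are an addition, not part of the original query: they turn the NULL produced by the LEFT JOIN for users without matching rows into 0, and can be dropped if NULL is preferred.

```sql
SELECT u.name,
       COALESCE(t2.transactions, 0) AS transactions,  -- assumption: show 0 instead of NULL
       COALESCE(o2.orders, 0)       AS orders
FROM users AS u
LEFT JOIN (
    SELECT t.aid AS tuid,
           SUM(t.amount) AS transactions
    FROM transactions AS t
    WHERE t.status = 1                    -- filter moved out of SUM(IF(...))
      AND t.datetime >= '2018-01-01'
      AND t.datetime <  '2020-01-02'      -- half-open range, day after the end
    GROUP BY t.aid
) AS t2 ON t2.tuid = u.id
LEFT JOIN (
    SELECT o.aid AS ouid,
           SUM(o.net) AS orders
    FROM orders AS o
    WHERE o.date >= '2018-01-01'
      AND o.date <  '2020-01-02'
    GROUP BY o.aid
) AS o2 ON o2.ouid = u.id
WHERE u.status = 1
ORDER BY transactions DESC;
```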
Answer 1 (score: 0)
Multiple JOIN ( SELECT ... ) used to be a performance killer (before MySQL 5.6). Now it may still be a performance problem.
An alternative is
SELECT u.name,
( SELECT ... WHERE ...=u.id ) AS transactions,
( SELECT ... WHERE ...=u.id ) AS orders
FROM users AS u
WHERE u.status = 1
ORDER BY transactions DESC
The first subquery is a correlated subquery and looks like
( SELECT SUM( IF(status = 1, amount, 0) )
FROM transactions
WHERE aid = u.id
AND datetime >= '2018-01-01'
AND datetime < '2018-01-01' + INTERVAL 2 YEAR -- up to, but not including, 2020-01-01
) AS transactions
(The other is similar.)
Indexes:
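For reference, here is a sketch of the complete query with both correlated subqueries filled in. The orders subquery is inferred from the question's original query (SUM(o.net), filtered on o.date), so treat it as an assumption about what "similar" means here.

```sql
SELECT u.name,
       ( SELECT SUM( IF(t.status = 1, t.amount, 0) )
           FROM transactions AS t
          WHERE t.aid = u.id
            AND t.datetime >= '2018-01-01'
            AND t.datetime <  '2018-01-01' + INTERVAL 2 YEAR
       ) AS transactions,
       ( SELECT SUM(o.net)
           FROM orders AS o
          WHERE o.aid = u.id
            AND o.date >= '2018-01-01'
            AND o.date <  '2018-01-01' + INTERVAL 2 YEAR
       ) AS orders
FROM users AS u
WHERE u.status = 1
ORDER BY transactions DESC;
```

Each subquery runs once per selected user, so this form tends to pay off when few users have status = 1 and the per-user lookups are indexed. Note that the upper bound here stops at 2020-01-01, one day earlier than the range in the question; use '2020-01-02' to match it exactly.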
users: INDEX(status, name, id) -- "covering"
transactions: INDEX(aid, datetime)
orders: INDEX(aid, date) or INDEX(aid, date, net)
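The index list above could be created along these lines. The index names are invented for illustration, and the statements assume that name is short enough to be indexed in full and that none of these indexes exist yet.

```sql
-- Covering index for the outer query: filter on status, read name and id.
-- With InnoDB, if id is the primary key it is already stored in every
-- secondary index, so listing it explicitly is optional.
ALTER TABLE users
  ADD INDEX idx_status_name_id (status, name, id);

-- Supports the correlated lookup: equality on aid, then the datetime range.
ALTER TABLE transactions
  ADD INDEX idx_aid_datetime (aid, datetime);

-- Same pattern for orders; including net makes the index covering
-- for SUM(net).
ALTER TABLE orders
  ADD INDEX idx_aid_date_net (aid, date, net);
```

Running EXPLAIN on the full query afterwards shows whether each subquery actually uses its index.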