GROUP BY is very slow when joining a temporary table in MySQL

Date: 2016-12-22 04:03:37

Tags: mysql performance group-by

The table structure is simple:

CREATE TABLE `trade` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `account` int(11) NOT NULL,
  `date` date NOT NULL,
  `amount` double DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `all_idx` (`date`,`account`,`amount`) USING BTREE
) ENGINE=InnoDB;

The table holds about 5 million rows.

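The requirements are:

  • A date range is given
  • Within that range, find the FIRST MAXIMUM trade amount for each account
  • Find the MINIMUM trade amount AFTER it (on or after the date of that first maximum)
  • Compute the DIFFERENCE between the two amounts (may be 0)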

Here is how I wrote the SQL:

-- step 1: find the max amount, took about 0.6s
select account, max(amount) max_amount
from trade
where date between '20160101' and '20161220'
group by account;

-- step 2: find the first date, took about 1s
drop temporary table if exists tmp_max_amount;
create temporary table tmp_max_amount
select t1.account, min(t1.date) date, t1.amount max_amount
from trade t1, (
    select account, max(amount) max_amount
    from trade
    where date between '20160101' and '20161220'
    group by account
) t2
where t1.account = t2.account and t1.amount = t2.max_amount
group by t1.account, t1.amount;

-- step 3: find the min amount, took about 50s
drop temporary table if exists tmp_min_amount;
create temporary table tmp_min_amount
select t1.account, min(t1.amount) min_amount
from trade t1, tmp_max_amount t2
where t1.account = t2.account and t1.date >= t2.date
group by t1.account;

-- step 4: calculate the difference, took about 0.8s
select x.account, (max_amount - min_amount) diff
from tmp_max_amount x, tmp_min_amount n
where x.account = n.account;

The SQL in step 3 took about 50 seconds. Is there a way to speed it up?

Sample data:

    id | account | date     | amount
 ------|---------|----------|---------
     1 |    1000 | 20151001 |   1000 <- not in range
     2 |    3000 | 20151002 |    100 <- not in range
     3 |    1000 | 20160105 |    800 <- max of 1000
     4 |    2000 | 20160110 |    200 <- max of 2000
     5 |    2000 | 20160115 |    100 <- min of 2000
     6 |    3000 | 20160201 |   1200
....
 10000 |    2000 | 20161210 |    200 <- not the first max
 10001 |    3000 | 20161210 |    500
 10002 |    3000 | 20161212 |   1500 <- max & min of 3000
 10003 |    1000 | 20161213 |    300 <- min of 1000

Expected result:

account | diff
--------|------
   1000 |  500 <- (800 - 300)
   2000 |  100 <- (200 - 100)
   3000 |    0 <- (1500 - 1500)
...

1 Answer:

Answer 0 (score: 1)

Please use the JOIN...ON syntax.
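As an illustration of that advice, step 3 from the question could be written with an explicit JOIN...ON as sketched below. This is only a syntactic rewrite of the asker's query (same tables, same columns, same conditions); on its own it is not expected to change the execution plan, it just keeps the join conditions next to the tables they relate.

```sql
-- Step 3 with explicit JOIN ... ON instead of a comma join plus WHERE.
-- The logic is unchanged from the query in the question.
DROP TEMPORARY TABLE IF EXISTS tmp_min_amount;
CREATE TEMPORARY TABLE tmp_min_amount
SELECT t1.account, MIN(t1.amount) AS min_amount
FROM tmp_max_amount AS t2
JOIN trade AS t1
  ON t1.account = t2.account
 AND t1.date >= t2.date      -- rows on or after the first-maximum date
GROUP BY t1.account;
```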

Step 2 needs INDEX(account, amount) on the trade table.
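A minimal sketch of adding that index to the table from the question. The index name account_amount_idx is a made-up placeholder, not something from the question or the answer; any name works.

```sql
-- Secondary index for the step 2 join on (account, amount).
-- The name account_amount_idx is hypothetical.
ALTER TABLE trade
    ADD INDEX account_amount_idx (account, amount);
```

On a table with about 5 million rows, building the index takes some time, so it may be worth running this outside peak hours.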

Step 3 needs an index on the temporary table, which is most easily created in step 2 by doing:

create temporary table tmp_max_amount
    ( INDEX(account, date) )   -- This was added
SELECT ..;

(This may not be optimal, but it should help.)
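Spelled out in full, the statement might look like the sketch below. It assumes the elided SELECT is the step 2 query from the question, written here with JOIN...ON; the maximum is aliased as max_amount so that step 4 can refer to it by that name. MySQL allows index definitions in parentheses ahead of the SELECT in CREATE TABLE ... SELECT, as long as the indexed columns are produced by the SELECT.

```sql
-- Step 2 with the index declared inline, so tmp_max_amount is created
-- already indexed on (account, date) for the step 3 join.
DROP TEMPORARY TABLE IF EXISTS tmp_max_amount;
CREATE TEMPORARY TABLE tmp_max_amount
    ( INDEX (account, date) )
SELECT t1.account, MIN(t1.date) AS date, t1.amount AS max_amount
FROM trade AS t1
JOIN (
    -- per-account maximum within the date range (step 1)
    SELECT account, MAX(amount) AS max_amount
    FROM trade
    WHERE date BETWEEN '20160101' AND '20161220'
    GROUP BY account
) AS t2
  ON t1.account = t2.account
 AND t1.amount = t2.max_amount
GROUP BY t1.account, t1.amount;
```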