Optimizing a MySQL query with a date range and a join

Time: 2016-03-31 11:45:06

Tags: mysql query-optimization

I have the following query:

SELECT COUNT(*)
  FROM datum d

  JOIN datum_type dt
    ON dt.datum_id = d.id
   AND dt.type_id = '3' 

 WHERE d.added_time >=  DATE_FORMAT(CURDATE(), '%Y-%m')
   AND d.added_time <   DATE_FORMAT(CURDATE() + INTERVAL 1 MONTH, '%Y-%m')

There are indexes on d.id (PRIMARY), d.added_time, dt.datum_id, and dt.type_id.

The current EXPLAIN plan is:

+----+-------------+-------+--------+--------------------+---------+---------+-------------+--------+-------------+
| id | select_type | table |  type  |   possible_keys    |   key   | key_len |     ref     |  rows  |    Extra    |
+----+-------------+-------+--------+--------------------+---------+---------+-------------+--------+-------------+
|  1 | SIMPLE      | dt    | ref    | type_id,datum_id   | type_id |       1 | const       | 602628 |             |
|  1 | SIMPLE      | d     | eq_ref | PRIMARY,added_time | PRIMARY |       8 | dt.datum_id |      1 | Using where |
+----+-------------+-------+--------+--------------------+---------+---------+-------------+--------+-------------+

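This takes quite a long time because we have a large number of datum records. It appears to filter datum_type by type first, join each of those rows to datum through the datum.id PRIMARY key, and then check every joined row to see whether datum.added_time falls within the range.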

I tried forcing the added_time index instead, but then the EXPLAIN plan is:

+----+-------------+-------+-------+------------------+------------+---------+------+---------+--------------------------+
| id | select_type | table | type  |  possible_keys   |    key     | key_len | ref  |  rows   |          Extra           |
+----+-------------+-------+-------+------------------+------------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | d     | index | added_time       | added_time |       4 | NULL | 6195194 | Using where; Using index |
|  1 | SIMPLE      | dt    | ref   | type_id,datum_id | datum_id   |       8 | d.id |       1 | Using where              |
+----+-------------+-------+-------+------------------+------------+---------+------+---------+--------------------------+  

This takes almost as long, because within the datum.added_time range there are many datum_type rows with different datum_type.type_id values.

Is there some combination of indexes that would speed this up?

1 Answer:

Answer 0: (score: 1)

I assume that added_time is a datetime or a date. In that case, you should not express the conditions as strings. Instead, use date constants:

SELECT COUNT(*)
FROM datum d JOIN
     datum_type dt
     ON dt.datum_id = d.id AND
        dt.type_id = '3' 
WHERE d.added_time >= DATE_SUB(CURDATE(), INTERVAL DAY(CURDATE()) - 1 DAY) AND
      d.added_time < DATE_ADD(DATE_SUB(CURDATE(), INTERVAL DAY(CURDATE()) - 1 DAY), INTERVAL 1 MONTH);

This can take advantage of indexes on datum(added_time, id) and datum_type(datum_id, type_id).
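For reference, the two composite indexes could be created roughly as follows. This is only a sketch: the index names are made up for illustration, and it assumes no equivalent indexes exist yet. Re-running EXPLAIN afterwards shows whether the optimizer actually picks them up.

```sql
-- Hypothetical index names; the column lists are the ones suggested above.
CREATE INDEX idx_datum_added_time_id
    ON datum (added_time, id);

CREATE INDEX idx_datum_type_datum_id_type_id
    ON datum_type (datum_id, type_id);

-- Check the plan again to see which keys the optimizer now chooses.
EXPLAIN
SELECT COUNT(*)
FROM datum d JOIN
     datum_type dt
     ON dt.datum_id = d.id AND
        dt.type_id = '3'
WHERE d.added_time >= DATE_SUB(CURDATE(), INTERVAL DAY(CURDATE()) - 1 DAY) AND
      d.added_time < DATE_ADD(DATE_SUB(CURDATE(), INTERVAL DAY(CURDATE()) - 1 DAY), INTERVAL 1 MONTH);
```

With both columns in each index, the range scan on added_time and the lookup by (datum_id, type_id) can potentially be answered from the indexes alone, which is what "Using index" in the Extra column would indicate.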

If datum_type produces no duplicate matches (that is, each datum should be counted only once), I would suggest rewriting the query as:

SELECT COUNT(*)
FROM datum d
WHERE d.added_time >= DATE_SUB(CURDATE(), INTERVAL DAY(CURDATE()) - 1 DAY) AND
      d.added_time < DATE_ADD(DATE_SUB(CURDATE(), INTERVAL DAY(CURDATE()) - 1 DAY), INTERVAL 1 MONTH) AND
      EXISTS (SELECT 1
              FROM datum_type dt
              WHERE dt.datum_id = d.id AND dt.type_id = '3'
             );

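If type_id is an integer, then you should remove the single quotes. Mixing different data types in SQL can confuse the optimizer and prevent the use of indexes.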
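To illustrate the point about quotes, one could first check how the column is declared and, if it turns out to be an integer type, compare against a numeric literal. This is a sketch that assumes type_id is in fact an integer column, which the question does not state.

```sql
-- Inspect the declared type of type_id.
SHOW COLUMNS FROM datum_type LIKE 'type_id';

-- Assuming an integer column: same EXISTS query, with a numeric literal.
SELECT COUNT(*)
FROM datum d
WHERE d.added_time >= DATE_SUB(CURDATE(), INTERVAL DAY(CURDATE()) - 1 DAY) AND
      d.added_time < DATE_ADD(DATE_SUB(CURDATE(), INTERVAL DAY(CURDATE()) - 1 DAY), INTERVAL 1 MONTH) AND
      EXISTS (SELECT 1
              FROM datum_type dt
              WHERE dt.datum_id = d.id AND dt.type_id = 3
             );
```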