MySQL query optimization using a composite index and a persistent column

Date: 2018-07-03 14:13:47

Tags: mysql sql indexing mariadb

The query below is running on MariaDB 10.0.28 and takes about 17 seconds. I am hoping to speed it up significantly.

select series_id,delivery_date,delivery_he,forecast_date,forecast_he,value 
from forecast where forecast_he=8 
AND series_id in (12142594,20735627,632287496,1146453088,1206342447,1154376340,2095084238,2445233529,2495523920,2541234725,2904312523,3564421486) 
AND delivery_date >= '2016-07-13' 
AND delivery_date < '2018-06-27' 
and DATEDIFF(delivery_date,forecast_date)=1

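My first attempt to speed it up was to create a persistent column defined as (datediff(delivery_date,forecast_date)), rebuild the index with that persistent column, and then modify the query to replace the datediff calculation with forecast_delivery_delta=1.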
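As a rough sketch of what that attempt could look like in MariaDB 10.0 syntax (the exact statements the asker ran are not shown, so the ALTER below is an assumption reconstructed from the column definition in the describe output):

```sql
-- Assumed DDL: add a stored (PERSISTENT) computed column holding the
-- day difference between delivery_date and forecast_date.
ALTER TABLE forecast
  ADD COLUMN forecast_delivery_delta TINYINT
  AS (DATEDIFF(delivery_date, forecast_date)) PERSISTENT;

-- The original query with the DATEDIFF() call replaced by the new column.
SELECT series_id, delivery_date, delivery_he, forecast_date, forecast_he, value
FROM forecast
WHERE forecast_he = 8
  AND series_id IN (12142594,20735627,632287496,1146453088,1206342447,1154376340,
                    2095084238,2445233529,2495523920,2541234725,2904312523,3564421486)
  AND delivery_date >= '2016-07-13'
  AND delivery_date < '2018-06-27'
  AND forecast_delivery_delta = 1;
```

The point of the rewrite is that a plain column comparison can be served by an index, whereas a function call over two columns in the WHERE clause cannot.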

> describe forecast;
+-------------------------+------------------+------+-----+---------+------------+
| Field                   | Type             | Null | Key | Default | Extra      |
+-------------------------+------------------+------+-----+---------+------------+
| series_id               | int(10) unsigned | NO   | PRI | 0       |            |
| delivery_date           | date             | NO   | PRI | NULL    |            |
| delivery_he             | int(11)          | NO   | PRI | NULL    |            |
| forecast_date           | date             | NO   | PRI | NULL    |            |
| forecast_he             | int(11)          | NO   | PRI | NULL    |            |
| value                   | float            | NO   |     | NULL    |            |
| forecast_delivery_delta | tinyint(4)       | YES  |     | NULL    | PERSISTENT |
+-------------------------+------------------+------+-----+---------+------------+

> show index from forecast;
+----------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name             | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| forecast |          0 | PRIMARY              |            1 | series_id     | A         |       35081 |     NULL | NULL   |      | BTREE      |         |               |
| forecast |          0 | PRIMARY              |            2 | delivery_date | A         |      130472 |     NULL | NULL   |      | BTREE      |         |               |
| forecast |          0 | PRIMARY              |            3 | delivery_he   | A         |     1290223 |     NULL | NULL   |      | BTREE      |         |               |
| forecast |          0 | PRIMARY              |            4 | forecast_date | A         |     2322401 |     NULL | NULL   |      | BTREE      |         |               |
| forecast |          0 | PRIMARY              |            5 | forecast_he   | A         |    23224016 |     NULL | NULL   |      | BTREE      |         |               |
| forecast |          1 | he_series_delta_date |            1 | forecast_he   | A         |       29812 |     NULL | NULL   |      | BTREE      |         |               |
| forecast |          1 | he_series_delta_date |            2 | series_id     | A         |       74198 |     NULL | NULL   |      | BTREE      |         |               |
| forecast |          1 | he_series_delta_date |            3 | delivery_date | A         |      774133 |     NULL | NULL   |      | BTREE      |         |               |
+----------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

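This seems to have cut the run time by about 2 seconds, but I am wondering whether there is a better way to speed this up significantly. I have considered adjusting the buffer size, but it does not appear to be badly misconfigured.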

>show variables like '%innodb_buffer_pool_size%';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 134217728 |
+-------------------------+-----------+


Total table size:
+----------+------------+
| Table    | Size in MB |
+----------+------------+
| forecast |    1547.00 |
+----------+------------+

EXPLAIN:
+------+-------------+----------+-------+------------------------------+----------------------+---------+------+--------+-----------------------+
| id   | select_type | table    | type  | possible_keys                | key                  | key_len | ref  | rows   | Extra                 |
+------+-------------+----------+-------+------------------------------+----------------------+---------+------+--------+-----------------------+
|    1 | SIMPLE      | forecast | range | PRIMARY,he_series_delta_date | he_series_delta_date | 11      | NULL | 832016 | Using index condition |
+------+-------------+----------+-------+------------------------------+----------------------+---------+------+--------+-----------------------+

1 answer:

Answer 0 (score: 1)

If you are going to say

AND forecast_delivery_delta=1

then the optimal index is one that starts with the two = columns:

(forecast_he, forecast_delivery_delta,    -- in either order
 series_id,           -- an IN might work ok next
 delivery_date)       -- finally a range

It is generally useless to put the column tested by a range (delivery_date) anywhere other than last.
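The column order above could be created as follows. This is only a sketch: the index name he_delta_series_date is hypothetical, and the SELECT is the question's query with the delta test substituted in.

```sql
-- Hypothetical index name; column order follows the list above:
-- the two equality columns first, then the IN column, then the range column.
ALTER TABLE forecast
  ADD INDEX he_delta_series_date
    (forecast_he, forecast_delivery_delta, series_id, delivery_date);

-- Check which index the optimizer picks and how many rows it expects to read.
EXPLAIN
SELECT series_id, delivery_date, delivery_he, forecast_date, forecast_he, value
FROM forecast
WHERE forecast_he = 8
  AND forecast_delivery_delta = 1
  AND series_id IN (12142594,20735627,632287496,1146453088,1206342447,1154376340,
                    2095084238,2445233529,2495523920,2541234725,2904312523,3564421486)
  AND delivery_date >= '2016-07-13'
  AND delivery_date < '2018-06-27';
```

Comparing the key and rows columns of this EXPLAIN against the one in the question (832016 rows via he_series_delta_date) would show whether the new index narrows the scan.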

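But note that this index will not work well if you say forecast_delivery_delta <= 2. That test is now a "range", and anything after it in the index will not be used for filtering. Still, it may be worthwhile to have a few different indexes in case you turn an = into a range or vice versa.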
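One way to read that advice, sketched under assumptions: if the delta test becomes a range, a second index could keep the equality and IN columns first and put the delta last. The index name he_series_delta is hypothetical, and whether ending on forecast_delivery_delta or on delivery_date filters better depends on the data, so both should be compared with EXPLAIN.

```sql
-- Hypothetical extra index for the variant where the delta test is a range:
-- equality column, then the IN column, then exactly one range column last.
ALTER TABLE forecast
  ADD INDEX he_series_delta (forecast_he, series_id, forecast_delivery_delta);

-- Variant query with a range on the delta; the date range can no longer
-- be resolved by the same index positions.
EXPLAIN
SELECT series_id, delivery_date, delivery_he, forecast_date, forecast_he, value
FROM forecast
WHERE forecast_he = 8
  AND series_id IN (12142594,20735627,632287496,1146453088,1206342447,1154376340,
                    2095084238,2445233529,2495523920,2541234725,2904312523,3564421486)
  AND forecast_delivery_delta <= 2
  AND delivery_date >= '2016-07-13'
  AND delivery_date < '2018-06-27';
```

The existing he_series_delta_date index (forecast_he, series_id, delivery_date) already covers the alternative of ending on the date range.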

Then increase innodb_buffer_pool_size to about 70% of RAM (assuming you have more than 4GB of RAM).
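For scale, the figures in the question put the buffer pool at 134217728 bytes (128 MB) against a table of 1547 MB. The arithmetic can be sketched as below; the 8 GB (8192 MB) of RAM is a hypothetical figure, since the question does not state the machine's memory.

```sql
-- Current buffer pool in MB; with the value shown in the question
-- (134217728 bytes) this works out to 128 MB.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;

-- Target of roughly 70% of RAM, assuming a hypothetical 8192 MB machine.
SELECT ROUND(0.70 * 8192) AS target_buffer_pool_mb;
```

As far as I know, this variable cannot be changed at runtime on MariaDB 10.0, so the new value would go under the [mysqld] section of the server configuration file, followed by a restart.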