Question

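We have a very large database that we query by a datetime column. Yesterday we ran into a problem: a particular query that normally takes 4 seconds was now taking more than 40.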

After some digging and debugging, we found the problem.

mysql> explain select count(*) from event where survey_id = 158 and event_datez>'2019-10-30 00:00:00' and event_datez<'2019-11-28 23:59:59' ; # Query takes 4s
+----+-------------+--------------+------------+-------+-----------------------------------------------+------------------+---------+------+---------+----------+------------------------------------+
| id | select_type | table        | partitions | type  | possible_keys                                 | key              | key_len | ref  | rows    | filtered | Extra                              |
+----+-------------+--------------+------------+-------+-----------------------------------------------+------------------+---------+------+---------+----------+------------------------------------+
|  1 | SIMPLE      |        event | NULL       | range | FK_g1lx0ea096nqioytyhtjng72t, i_event_2       | i_event_2        | 6       | NULL | 2975160 |    50.00 | Using index condition; Using where |
+----+-------------+--------------+------------+-------+-----------------------------------------------+------------------+---------+------+---------+----------+------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select count(*) from event where survey_id = 158 and event_datez>'2019-10-29 00:00:00' and event_datez<'2019-11-28 23:59:59' ; # Query takes 40s
+----+-------------+--------------+------------+------+-----------------------------------------------+------------------------------+---------+-------+----------+----------+-------------+
| id | select_type | table        | partitions | type | possible_keys                                 | key                          | key_len | ref   | rows     | filtered | Extra       |
+----+-------------+--------------+------------+------+-----------------------------------------------+------------------------------+---------+-------+----------+----------+-------------+
|  1 | SIMPLE      | event        | NULL       | ref  | FK_g1lx0ea096nqioytyhtjng72t,i_event_2        | FK_g1lx0ea096nqioytyhtjng72t | 9       | const | 16272884 |    12.23 | Using where |
+----+-------------+--------------+------------+------+-----------------------------------------------+------------------------------+---------+-------+----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

The problem is that InnoDB changed the index used by the query. My question is simple: why does this happen?

Answer 1

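Why doesn't the index at the back of a book include common words like "the" or "and"? Because they would match every page in the book, so looking them up in the index would be no help. You might as well just read every page of the book, cover to cover.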

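If MySQL estimates that a condition will match a large portion of the rows, it will not use the index. The exact threshold is not documented, but in my experience it is around 20-25% of the table. Note that MySQL's index statistics are not always accurate either. They are estimates based on sampled data.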
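Because the optimizer works from sampled statistics, it can be worth checking whether they are stale. A minimal sketch, assuming the `event` table from the question and that running `ANALYZE TABLE` on it is acceptable in your environment:

```sql
-- Inspect the current cardinality estimates for each index on the table.
SHOW INDEX FROM event;

-- Re-sample the index statistics (InnoDB estimates them from a sample of index pages).
ANALYZE TABLE event;

-- Re-check the plan afterwards; the row estimates may or may not change.
EXPLAIN SELECT COUNT(*) FROM event
WHERE survey_id = 158
  AND event_datez > '2019-10-29 00:00:00'
  AND event_datez < '2019-11-28 23:59:59';
```

Refreshing the statistics does not guarantee a different plan. It only rules out stale estimates as the cause.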

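In the second query, the range condition on the date is slightly wider, so it matches more rows. That may have been just enough to cross the threshold, so MySQL decided not to use the i_event_2 index.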

MySQL may also have a slight preference for a query plan that uses type: ref over one that uses type: range.
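If you want to see the cost estimates behind the choice rather than guess at them, the optimizer trace can help. This is a sketch, assuming MySQL 5.6 or later and a session where you are allowed to change `optimizer_trace`:

```sql
-- Turn on tracing for this session only.
SET optimizer_trace = 'enabled=on';

-- EXPLAIN is traced too, so the slow query does not have to be executed.
EXPLAIN SELECT COUNT(*) FROM event
WHERE survey_id = 158
  AND event_datez > '2019-10-29 00:00:00'
  AND event_datez < '2019-11-28 23:59:59';

-- The trace is a JSON document listing the access paths that were
-- considered for the table, with the estimated rows and cost of each.
SELECT TRACE FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

-- Turn tracing back off.
SET optimizer_trace = 'enabled=off';
```

Comparing the trace for the 4-second and the 40-second variants should show where the estimated cost of the range scan on i_event_2 overtakes the ref access on the foreign key index.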

You can use an index hint to make MySQL consider only the i_event_2 index.

select count(*) from event USE INDEX (i_event_2)
where survey_id = 158
  and event_datez>'2019-10-29 00:00:00' 
  and event_datez<'2019-11-28 23:59:59' ;
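
With USE INDEX the optimizer can still fall back to a full table scan if it judges that cheaper. A stricter variant, shown here only as a sketch, is FORCE INDEX, which makes MySQL treat a table scan as very expensive:

```sql
-- Same query, but a table scan is only chosen if the named index cannot be used at all.
SELECT COUNT(*) FROM event FORCE INDEX (i_event_2)
WHERE survey_id = 158
  AND event_datez > '2019-10-29 00:00:00'
  AND event_datez < '2019-11-28 23:59:59';
```

Either hint ties the query to an index name, so it has to be revisited if the index is ever renamed or dropped.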

But I think it would be better to create a compound index on the two columns:

ALTER TABLE event ADD INDEX i_event_survey_datez (survey_id, event_datez);
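
After adding the index, it is worth confirming that the optimizer actually picks it up. A sketch, assuming the index name i_event_survey_datez from the statement above:

```sql
-- Without any hint, check which index the optimizer now chooses.
EXPLAIN SELECT COUNT(*) FROM event
WHERE survey_id = 158
  AND event_datez > '2019-10-29 00:00:00'
  AND event_datez < '2019-11-28 23:59:59';
-- What to look for: key = i_event_survey_datez. Since the query reads only
-- survey_id and event_datez, the new index can cover it, which EXPLAIN
-- reports as "Using index" in the Extra column.
```

The equality column (survey_id) comes first in the index and the range column (event_datez) second, so the matching entries sit in one contiguous slice of the index.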
