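I am trying to improve the performance of some queries by adding indexes and checking the plans with EXPLAIN. I noticed that every time I run SHOW INDEX FROM TableB; the value in the rows column of the EXPLAIN output for the query changes.
For example: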
mysql> EXPLAIN Select A.id
From TableA A
Inner join TableB B
On A.address = B.address And A.code = B.code
Group by A.id
Having count(distinct B.id) = 1;
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | B | index | test_index | PRIMARY | 518 | NULL | 10561 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | A | eq_ref | PRIMARY | PRIMARY | 514 | db.B.address,db.B.code | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)
mysql> show index from TableB;
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| TableB | 0 | PRIMARY | 1 | id | A | 7 | NULL | NULL | | BTREE | |
| TableB | 0 | PRIMARY | 2 | address | A | 21 | NULL | NULL | | BTREE | |
| TableB | 0 | PRIMARY | 3 | code | A | 10402 | NULL | NULL | | BTREE | |
| TableB | 1 | test_index | 1 | address | A | 1 | NULL | NULL | | BTREE | |
| TableB | 1 | test_index | 2 | code | A | 10402 | NULL | NULL | | BTREE | |
| TableB | 1 | test_index | 3 | id | A | 10402 | NULL | NULL | | BTREE | |
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (0.03 sec)
and...
mysql> EXPLAIN Select A.id
From TableA A
Inner join TableB B
On A.address = B.address And A.code = B.code Group by A.id
Having count(distinct B.id) = 1;
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | B | index | test_index | PRIMARY | 518 | NULL | 9800 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | A | eq_ref | PRIMARY | PRIMARY | 514 | db.B.address,db.B.code | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)
Why does this happen?
Answer 0 (score: 3)
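The rows column should be treated only as a rough estimate. It is not an exact number.
It is a statistical estimate of how many rows will be examined during the query. The actual number of rows cannot be known until the query is actually executed.
The statistics are based on samples read from the table periodically. These samples are occasionally re-read, for example after you run ANALYZE TABLE, certain INFORMATION_SCHEMA queries, or certain SHOW statements.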
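One way to observe this is sketched below. It assumes TableB is an InnoDB table, which the question does not state. The innodb_stats_on_metadata variable controls whether metadata statements such as SHOW INDEX trigger a resample; its availability and default depend on the server version.

```sql
-- Assumption: TableB uses InnoDB; the question does not state the engine.
-- Check whether metadata statements (SHOW INDEX, INFORMATION_SCHEMA queries)
-- trigger a statistics resample on this server.
SHOW VARIABLES LIKE 'innodb_stats_on_metadata';

-- Explicitly resample the index statistics for the table.
ANALYZE TABLE TableB;

-- Cardinality values here are estimates and may differ between resamples.
SHOW INDEX FROM TableB;

-- The rows estimate below is derived from the same sampled statistics.
EXPLAIN
SELECT A.id
FROM TableA A
INNER JOIN TableB B
  ON A.address = B.address AND A.code = B.code
GROUP BY A.id
HAVING COUNT(DISTINCT B.id) = 1;
```

If the variable is ON, setting it to OFF (SET GLOBAL innodb_stats_on_metadata = OFF, which requires sufficient privileges) should stop SHOW INDEX from changing the estimates, at the cost of statistics that are refreshed less often.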
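Answer 1 (score: 0)
I don't find a 20% change in the statistics to be a big deal. In many cases, think of the cost curve as an upturned parabola: you only need to know which side of the minimum point you are on. The optimizer is more likely to get into trouble in complex queries, where it needs more than the simple statistics, such as the histograms of MariaDB 10.0 / 10.1. (I don't have enough experience with them to say whether they make much headway.)
Your particular query will probably be performed only one way, regardless of the statistics. An example of a complex query would be a JOIN with WHERE clauses filtering each of the tables. The optimizer has to decide which table to start with. Another case is a single table with a WHERE and an ORDER BY that cannot both be handled by a single index. Should it use an index for filtering, but then have to sort? Or should it use an index for the ORDER BY, but have to filter on the fly?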
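The single-table case can be sketched as follows. The table, columns, and index names are hypothetical and do not come from the question. FORCE INDEX is used only to make each of the two candidate plans visible in EXPLAIN.

```sql
-- Hypothetical table, columns, and index names, for illustration only.
CREATE TABLE orders (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  amount DECIMAL(10,2) NOT NULL,
  created_at DATETIME NOT NULL,
  INDEX idx_amount (amount),
  INDEX idx_created (created_at)
);

-- Option 1: filter through idx_amount, then sort the matching rows.
EXPLAIN
SELECT id FROM orders FORCE INDEX (idx_amount)
WHERE amount > 100
ORDER BY created_at
LIMIT 10;

-- Option 2: read rows in created_at order through idx_created,
-- filtering on amount as the rows are read.
EXPLAIN
SELECT id FROM orders FORCE INDEX (idx_created)
WHERE amount > 100
ORDER BY created_at
LIMIT 10;
```

Because the filter is a range on one column and the sort is on another, no single index serves both. Which option is cheaper depends on how selective the filter is, and that is exactly what the optimizer has to guess from its statistics. On a populated table, the first plan will likely show a filesort in the Extra column and the second will not, though the exact output depends on the server version and data.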