JOIN does not use indexes in a predictable way

Time: 2015-10-13 16:34:14

Tags: mysql sql join indexing left-join

Say I have three tables.

CREATE TABLE movies (
    id INT AUTO_INCREMENT,
    name VARCHAR(255),
    PRIMARY KEY (id)
);

CREATE TABLE movies_actors (
    id INT AUTO_INCREMENT,
    movie_id INT,
    actor_id INT,
    current_salary_id INT,
    PRIMARY KEY (id),
    KEY movie_id (movie_id),
    KEY actor_id (actor_id),
    KEY current_salary_id (current_salary_id)
);

CREATE TABLE movies_actors_salaries (
    id INT AUTO_INCREMENT,
    actor_id INT,
    compensation_type ENUM('salary','hourly','commission','lumpsum'),
    amount DECIMAL(9,2),
    date_agreed_upon DATETIME,
    PRIMARY KEY (id),
    KEY actor_id (actor_id)
);

I'm trying to join these tables to run some queries, and the indexes are being used quite haphazardly. I don't know why.

SELECT COUNT(1)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;

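If I run EXPLAIN, the Extra column for the ma table does not say "Using index". It doesn't matter whether I do LEFT JOIN movies_actors_salaries or JOIN movies_actors_salaries; the index just isn't used. I don't get it, since m.id is the PRIMARY KEY of the movies table and ma.movie_id is a KEY.

I also tried another query: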

SELECT COUNT(1)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.id = mas.actor_id;

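If I run EXPLAIN, the Extra column for the ma table does not say "Using index", but if I do LEFT JOIN movies_actors_salaries instead of JOIN, the index is used. Again, I don't get it: why does the index used for the movies_actors table depend on how I join the movies_actors_salaries table?

Honestly, I don't understand this. It seems to me that when EXPLAIN is run, the Extra column for all four cases (i.e. the two queries above, each with JOIN movies_actors_salaries and with LEFT JOIN movies_actors_salaries) should say "Using index".

I'm using Percona MySQL 5.5.35-33.0. Any ideas?

1 answer:

Answer 0: (score: 1)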

More of a concern than the explain showing rows=1 and Using where for ma:

mysql> explain SELECT COUNT(m.id) FROM movies m JOIN movies_actors ma ON m.id = ma.movie_id JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;
+----+-------------+-------+--------+----------------------------+---------+---------+-----------------------------------+------+-------------+
| id | select_type | table | type   | possible_keys              | key     | key_len | ref                               | rows | Extra       |
+----+-------------+-------+--------+----------------------------+---------+---------+-----------------------------------+------+-------------+
|  1 | SIMPLE      | ma    | ALL    | movie_id,current_salary_id | NULL    | NULL    | NULL                              |    1 | Using where |
|  1 | SIMPLE      | mas   | eq_ref | PRIMARY                    | PRIMARY | 4       | so_gibberish.ma.current_salary_id |    1 | Using index |
|  1 | SIMPLE      | m     | eq_ref | PRIMARY                    | PRIMARY | 4       | so_gibberish.ma.movie_id          |    1 | Using index |
+----+-------------+-------+--------+----------------------------+---------+---------+-----------------------------------+------+-------------+
3 rows in set (0.05 sec)

is what happens when the last key is dropped, as seen here:

-- drop table movies_actors;
CREATE TABLE movies_actors (
    id INT AUTO_INCREMENT,
    movie_id INT,
    actor_id INT,
    current_salary_id INT,
    PRIMARY KEY (id),
    KEY movie_id (movie_id),
    KEY actor_id (actor_id)
    -- KEY current_salary_id (current_salary_id)
);

This results in the new, dreaded explain with rows=1024 and Using where; Using join buffer (Block Nested Loop), seen after the schema change above and after jamming in rows:

+----+-------------+-------+--------+---------------+----------+---------+--------------------------+------+----------------------------------------------------+
| id | select_type | table | type   | possible_keys | key      | key_len | ref                      | rows | Extra                                              |
+----+-------------+-------+--------+---------------+----------+---------+--------------------------+------+----------------------------------------------------+
|  1 | SIMPLE      | mas   | index  | PRIMARY       | actor_id | 5       | NULL                     |    1 | Using index                                        |
|  1 | SIMPLE      | ma    | ALL    | movie_id      | NULL     | NULL    | NULL                     | 1024 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | m     | eq_ref | PRIMARY       | PRIMARY  | 4       | so_gibberish.ma.movie_id |    1 | Using index                                        |
+----+-------------+-------+--------+---------------+----------+---------+--------------------------+------+----------------------------------------------------+

The Takeaway

Explain is cryptic, as if you didn't know. But the fact that your row counts are low compared with the alternative just shown (i.e. 1k rows, Using filesort, Using temporary) should be soothing.

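Explain is also a liar. It is a whimsical fantasy land that is expected to render a few lines in a split second, but once the Explain is removed and the query actually runs, the server changes course based on the realities on the ground.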
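One way to check what the server actually did, rather than what Explain predicted, is to look at the session's handler counters after running the real query. This is a minimal sketch that assumes the three tables from the question. It is not part of the original answer; the counter descriptions in the comments follow the MySQL documentation.

```sql
-- Reset this session's status counters (FLUSH STATUS requires the RELOAD privilege).
FLUSH STATUS;

-- Run the real query, without EXPLAIN.
SELECT COUNT(1)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;

-- Inspect how rows were actually read:
--   Handler_read_key      : rows fetched through an index lookup
--   Handler_read_next     : rows read in key order (index or range scan)
--   Handler_read_rnd_next : rows read sequentially from the data file (table scan)
SHOW SESSION STATUS LIKE 'Handler_read%';
```

A high Handler_read_rnd_next relative to the table size points to a table scan, whatever the Explain output suggested.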

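I can get a row in Explain that matches your join, with Using index suggesting that the index on movies_actors_salaries will be used, but I guarantee you that it won't necessarily be used when the query runs. An excerpt from the MySQL manual:

"Indexes are less important for queries on small tables, or big tables where report queries process most or all of the rows. When a query needs to access most of the rows, reading sequentially is faster than working through an index. Sequential reads minimize disk seeks, even if not all the rows are needed for the query."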
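To see the manual's point on your own data, you could compare the plan when the index on ma is forced with the plan when it is ignored. This is only a sketch, not something from the original answer. FORCE INDEX and IGNORE INDEX are standard MySQL index hints, but whether the two plans differ at all depends on your data and server version.

```sql
-- Plan with the movie_id index forced: a table scan on ma is treated as very expensive.
EXPLAIN SELECT COUNT(1)
FROM movies m
JOIN movies_actors ma FORCE INDEX (movie_id) ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;

-- Plan with the same index ignored, for comparison.
EXPLAIN SELECT COUNT(1)
FROM movies m
JOIN movies_actors ma IGNORE INDEX (movie_id) ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;
```

If both variants run equally fast on a table this small, that is consistent with the quote: the sequential read costs no more than going through the index.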

So you are good to go. Keep an eye on the row count for mas, and on any Using filesort and Using temporary warnings.
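The "jamming in rows" step can be reproduced roughly as follows. This is a sketch under stated assumptions: the three tables start empty, so the seeded ids are all 1, and the values are made-up placeholders. The rows column in Explain is an estimate, so it may not match the real count exactly.

```sql
-- Seed one row per table (placeholder values, assumed for illustration).
INSERT INTO movies (name) VALUES ('placeholder movie');
INSERT INTO movies_actors_salaries (actor_id, compensation_type, amount, date_agreed_upon)
VALUES (1, 'salary', 1000.00, NOW());
INSERT INTO movies_actors (movie_id, actor_id, current_salary_id) VALUES (1, 1, 1);

-- Each run of this statement doubles movies_actors.
-- Run it ten times in total to go from 1 row to 1024 rows.
INSERT INTO movies_actors (movie_id, actor_id, current_salary_id)
SELECT movie_id, actor_id, current_salary_id FROM movies_actors;

-- Confirm the real row count, then look at the plan again.
SELECT COUNT(*) FROM movies_actors;

EXPLAIN SELECT COUNT(m.id)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;
```

Re-running the Explain as the table grows is a cheap way to notice when the rows estimate or the Extra column starts to change.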