How to make well-indexed MySQL tables join efficiently

Time: 2012-06-12 20:48:26

Tags: mysql performance join indexing

Here is the first table, 'tbl1':

+---------+---------------------+------+-----+---------+----------------+
| Field   | Type                | Null | Key | Default | Extra          |
+---------+---------------------+------+-----+---------+----------------+
| val     | varchar(45)         | YES  | MUL | NULL    |                |
| id      | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
+---------+---------------------+------+-----+---------+----------------+

It has the following indexes:

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tbl1  |          0 | PRIMARY  |            1 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
| tbl1  |          1 | val      |            1 | val         | A         |     2147085 |     NULL | NULL   | YES  | BTREE      |         |
| tbl1  |          1 | id_val   |            1 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
| tbl1  |          1 | id_val   |            2 | val         | A         |   201826018 |     NULL | NULL   | YES  | BTREE      |         |
| tbl1  |          1 | val_id   |            1 | val         | A         |     2147085 |     NULL | NULL   | YES  | BTREE      |         |
| tbl1  |          1 | val_id   |            2 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

(The reason for some of the extra indexes is explained here: http://bit.ly/KWx1Xz.)

The second table is almost identical. Here are its index cardinalities:

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tbl2   |          0 | PRIMARY  |            1 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
| tbl2   |          1 | val      |            1 | val         | A         |      881336 |     NULL | NULL   | YES  | BTREE      |         |
| tbl2   |          1 | id_val   |            1 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
| tbl2   |          1 | id_val   |            2 | val         | A         |   201826018 |     NULL | NULL   | YES  | BTREE      |         |
| tbl2   |          1 | val_id   |            1 | val         | A         |      881336 |     NULL | NULL   | YES  | BTREE      |         |
| tbl2   |          1 | val_id   |            2 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
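For anyone who wants to reproduce the setup, the two listings above correspond roughly to the definition below. This is a sketch reconstructed from the DESCRIBE and SHOW INDEX output only: the storage engine and character set are not given in the question, so they are left at the server defaults, and tbl2 is assumed to have exactly the same structure ("almost identical" is all the question says).

```sql
-- Reconstructed from the listings above; engine and charset are unknown
-- and therefore omitted (server defaults apply).
CREATE TABLE tbl1 (
  val VARCHAR(45) NULL,
  id  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id),
  KEY val (val),
  KEY id_val (id, val),
  KEY val_id (val, id)
);

-- Assumption: tbl2 has the same columns and indexes as tbl1.
CREATE TABLE tbl2 LIKE tbl1;
```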

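The task is to inner join them for a given value of the val column and get the list of matching ids (and to do it in under 1 second).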

Here is the 'join' approach:

SELECT tbl1.id FROM tbl1 JOIN tbl2 ON tbl1.val = 'iii' AND tbl2.val = 'iii' AND tbl1.id = tbl2.id;

Result: 10831 rows in set (55.15 sec)

EXPLAIN output:

+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+
| id | select_type | table  | type   | possible_keys                    | key     | key_len | ref                       | rows | Extra                    |
+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+
|  1 | SIMPLE      | tbl1   | ref    | PRIMARY,val,id_val,val_id        | val_id  | 138     | const                     | 5160 | Using where; Using index |
|  1 | SIMPLE      | tbl2   | eq_ref | PRIMARY,val,id_val,val_id        | PRIMARY | 8       | search_test.tbl1.id       | 1    | Using where              |
+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+

Here is the 'in' approach:

SELECT id FROM tbl1 WHERE val = 'iii' and id IN (SELECT id FROM tbl2 WHERE val = 'iii');

Result: 10831 rows in set (1 min 10.15 sec)

EXPLAIN output:

+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+
| id | select_type        | table  | type            | possible_keys                   | key     | key_len | ref   | rows | Extra                    |
+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+
|  1 | PRIMARY            | tbl1   | ref             | val,val_id                      | val_id  | 138     | const | 8553 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | tbl2   | unique_subquery | PRIMARY,val,id_val,val_id       | PRIMARY | 8       | func  |    1 | Using where              |
+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+

So, the question is: how can this query be tuned so that MySQL finishes it in under one second?

2 answers:

Answer 0: (score: 2)

SELECT tbl1.id FROM tbl1 JOIN tbl2 ON tbl1.id = tbl2.id AND tbl1.val = tbl2.val
WHERE tbl1.val = 'iii';
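
The answer gives no timing, so it may be worth checking what plan this rewrite produces. In the question's EXPLAIN output, tbl2 is read through PRIMARY with "Using where" and without "Using index", which means each candidate row is fetched from the table just to compare val. The thing to look for is whether tbl2 switches to the val_id index with "Using index". The second statement below is a variant with index hints; FORCE INDEX is standard MySQL syntax, but using it here is an assumption that does not come from the answer and has not been tested against this data.

```sql
-- Plan for the rewritten join as posted in the answer.
EXPLAIN
SELECT tbl1.id
FROM tbl1
JOIN tbl2 ON tbl1.id = tbl2.id AND tbl1.val = tbl2.val
WHERE tbl1.val = 'iii';

-- Variant (assumption, not from the answer): ask the optimizer to use the
-- (val, id) index on both tables so the join can be resolved from the
-- indexes alone, without fetching table rows.
EXPLAIN
SELECT tbl1.id
FROM tbl1 FORCE INDEX (val_id)
JOIN tbl2 FORCE INDEX (val_id)
  ON tbl1.id = tbl2.id AND tbl1.val = tbl2.val
WHERE tbl1.val = 'iii';
```

If the hinted plan shows "Using index" for both tables, dropping the EXPLAIN keyword and timing the query would show whether it actually helps.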

Answer 1: (score: 2)

OK, I have tested this on more than 30,000 records in each table and it runs pretty quickly.

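As it stands, you are performing the join on two very large tables, but if you first filter each table on 'val', the sets being joined become drastically smaller.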

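I originally posted this answer as a set of subqueries, but I did not realize that MySQL is slow with nested subqueries because it executes them from the outside in. If you define the subqueries as views, however, it runs them from the inside out.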

So, first create the views.

CREATE VIEW tbl1_iii AS (
SELECT * FROM tbl1 WHERE val='iii'
);
CREATE VIEW tbl2_iii AS (
SELECT * FROM tbl2 WHERE val='iii'
);

Then run the query.

SELECT tbl1_iii.id from tbl1_iii,tbl2_iii
WHERE tbl1_iii.id = tbl2_iii.id;

Lightning fast.
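
Since the test above used about 30,000 rows per table rather than the roughly 200 million in the question, it seems worth confirming the plan on the real data. A minimal check, assuming the two views have been created exactly as shown above:

```sql
-- Same join as above, written with an explicit JOIN and prefixed with
-- EXPLAIN so that MySQL reports the plan instead of running the query.
EXPLAIN
SELECT tbl1_iii.id
FROM tbl1_iii
JOIN tbl2_iii ON tbl1_iii.id = tbl2_iii.id;
```

If the output lists tbl1 and tbl2 directly rather than derived tables, MySQL has merged the views into the outer query, and the plan can be compared line by line with the one in the question.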