How to fix this extremely slow MySQL query

Asked: 2016-03-24 20:52:53

Tags: mysql performance query-optimization

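The purpose of this query is to list the distinct users that someone has connections with (i.e., the users that the user with ID 256 follows or is followed by), excluding users who are blocked by, or are blocking, the current user making the request (the user with ID 2).

The relationships table is very simple. The status column can hold one of two values, 'following' or 'blocked':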

mysql> describe relationships;
+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| id          | int(11)      | NO   | PRI | NULL    | auto_increment |
| follower_id | int(11)      | YES  | MUL | NULL    |                |
| followee_id | int(11)      | YES  | MUL | NULL    |                |
| created_at  | datetime     | YES  |     | NULL    |                |
| updated_at  | datetime     | YES  |     | NULL    |                |
| status      | varchar(191) | YES  | MUL | NULL    |                |
+-------------+--------------+------+-----+---------+----------------+

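This query currently takes about 58 seconds to complete! User 256 has only 1,500 connections. For context, there are about 10,000 user rows and 5,500 relationship rows.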

SELECT DISTINCT `users`.*, 
    -- "followed" is just a flag indicating if user #2 is currently following a given user
    (
      SELECT COUNT(*) FROM `relationships`  
      WHERE `relationships`.`followee_id` = `users`.`id` 
        AND `relationships`.`follower_id` = 2
    ) AS 'followed'
FROM `users` 
INNER JOIN `relationships` 
ON (
  (`users`.`id` = `relationships`.`follower_id` 
    AND `relationships`.`followee_id` = 256
  ) 
  OR (`users`.`id` = `relationships`.`followee_id` 
    AND `relationships`.`follower_id` = 256
  )
)
WHERE `relationships`.`status` = 'following' 
  AND (
    -- Ensure we don't return users who are blocked by user #2 
    `users`.`id` NOT IN (
      SELECT `relationships`.`followee_id` 
      FROM `relationships` 
      WHERE `relationships`.`follower_id` = 2
        AND `relationships`.`status` = 'blocked'
    )
  )
  AND (
    -- Ensure we don't return users who are blocking user #2 
    `users`.`id` NOT IN (
      SELECT `relationships`.`follower_id` 
      FROM `relationships` 
      WHERE `relationships`.`followee_id` = 2 
        AND `relationships`.`status` = 'blocked'
    )
  )
ORDER BY `users`.`id` ASC 
LIMIT 10

These are the current indexes on relationships:

mysql> show index from relationships;
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table         | Non_unique | Key_name                                                      | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| relationships |          0 | PRIMARY                                                       |            1 | id          | A         |        3002 |     NULL | NULL   |      | BTREE      |         |               |
| relationships |          0 | index_relationships_on_status_and_follower_id_and_followee_id |            1 | status      | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |               |
| relationships |          0 | index_relationships_on_status_and_follower_id_and_followee_id |            2 | follower_id | A         |        3002 |     NULL | NULL   | YES  | BTREE      |         |               |
| relationships |          0 | index_relationships_on_status_and_follower_id_and_followee_id |            3 | followee_id | A         |        3002 |     NULL | NULL   | YES  | BTREE      |         |               |
| relationships |          1 | index_relationships_on_followee_id                            |            1 | followee_id | A         |        3002 |     NULL | NULL   | YES  | BTREE      |         |               |
| relationships |          1 | index_relationships_on_follower_id                            |            1 | follower_id | A         |        3002 |     NULL | NULL   | YES  | BTREE      |         |               |
| relationships |          1 | index_relationships_on_status_and_followee_id_and_follower_id |            1 | status      | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |               |
| relationships |          1 | index_relationships_on_status_and_followee_id_and_follower_id |            2 | followee_id | A         |        3002 |     NULL | NULL   | YES  | BTREE      |         |               |
| relationships |          1 | index_relationships_on_status_and_followee_id_and_follower_id |            3 | follower_id | A         |        3002 |     NULL | NULL   | YES  | BTREE      |         |               |
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

EXPLAIN output:

mysql> EXPLAIN SELECT DISTINCT `users`.*, (SELECT COUNT(*) FROM `relationships` WHERE `relationships`.`followee_id` = `users`.`id` AND `relationships`.`follower_id` = 2) AS 'followed' FROM `users` INNER JOIN `relationships` ON(`users`.`id` = `relationships`.`follower_id` AND `relationships`.`followee_id` = 256) OR (`users`.`id` = `relationships`.`followee_id` AND `relationships`.`follower_id` = 256) WHERE `relationships`.`status` = 'following' AND (`users`.`id` NOT IN (SELECT `relationships`.`followee_id` FROM `relationships` WHERE `relationships`.`follower_id` = 2 AND `relationships`.`status` = 'blocked')) AND (`users`.`id` NOT IN (SELECT `relationships`.`follower_id` FROM `relationships` WHERE `relationships`.`followee_id` = 2 AND `relationships`.`status` = 'blocked')) ORDER BY `users`.`id` ASC LIMIT 10;
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
| id | select_type        | table         | type        | possible_keys                                                                                                                                                                                     | key                                                                   | key_len | ref                           | rows | Extra                                                                                                                            |
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
|  1 | PRIMARY            | relationships | index_merge | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_followee_id,index_relationships_on_follower_id | 5,5     | NULL                          |    2 | Using union(index_relationships_on_followee_id,index_relationships_on_follower_id); Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | users         | ALL         | PRIMARY                                                                                                                                                                                           | NULL                                                                  | NULL    | NULL                          | 1534 | Range checked for each record (index map: 0x1)                                                                                   |
|  4 | SUBQUERY           | relationships | ref         | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_status_and_follower_id_and_followee_id         | 767     | const                         |    1 | Using where; Using index                                                                                                         |
|  3 | SUBQUERY           | relationships | ref         | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_status_and_follower_id_and_followee_id         | 772     | const,const                   |    1 | Using where; Using index                                                                                                         |
|  2 | DEPENDENT SUBQUERY | relationships | ref         | index_relationships_on_followee_id,index_relationships_on_follower_id                                                                                                                             | index_relationships_on_followee_id                                    | 5       | development.users.id |    1 | Using where                                                                                                                      |
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.01 sec)

3 answers:

Answer 0 (score: 1)

It is hard to give a specific answer without testing this, but I think this part of the query is the problem:

SELECT DISTINCT `users`.*, (
      SELECT COUNT(*) FROM `relationships`  
      WHERE `relationships`.`followee_id` = `users`.`id` 
        AND `relationships`.`follower_id` = 2
    ) AS 'followed'

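You are also using ORDER BY. Remove the DISTINCT and the ORDER BY and see whether that speeds things up. I know it changes the query, but I suspect the DISTINCT is essentially building a bunch of temporary tables and throwing them away for every row it needs to check. Take a look here: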

http://dev.mysql.com/doc/refman/5.7/en/distinct-optimization.html

COUNT can be slow. Make sure the count runs on the fastest column. See this:

https://www.percona.com/blog/2007/04/10/count-vs-countcol/
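
Since `followed` is described in the question as just a flag, one option is to replace the count with an EXISTS test, which can stop at the first matching row. The following is a minimal, untested sketch against the question's schema; the aliases `u` and `r` are my own, and the column now returns 0 or 1 rather than a row count:

```sql
-- Sketch only: computes the "followed" flag with EXISTS instead of COUNT(*).
-- Assumes the users and relationships tables from the question.
SELECT u.id,
       EXISTS (
         SELECT 1
         FROM relationships r
         WHERE r.follower_id = 2      -- the requesting user
           AND r.followee_id = u.id   -- the user being listed
       ) AS followed
FROM users u
ORDER BY u.id
LIMIT 10;
```

Whether this is actually faster depends on the available indexes, so compare the EXPLAIN output before and after.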

A good way to think about SQL is in SETS. Fortunately, MySQL supports subqueries.

https://dev.mysql.com/doc/refman/5.7/en/from-clause-subqueries.html

Some pseudo-SQL follows:

select user_id
from relationships as follower, relationships as followee
where ...

Above we have two sets that we can operate on. With subqueries this gets really interesting:

select user_id
from (select user_id as f1 from relationships where ...) as follower, 
     (select user_id as f2 from relationships where ...) as followee
where ...

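I have always found something like the above to be a simple way of thinking about self-referencing tables.

Answer 1 (score: 1)

It is hard to say exactly how to optimize the query and the schema, so first some general tips: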

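  1. Use integer/bit/enum columns instead of varchars
  2. Use NOT NULL columns wherever possible
  3. Unsigned columns usually make sense (at the very least they give a larger range)
  4. Try different ways of structuring the query (see below)
  5. DISTINCT is a very expensive operation
  6. Subqueries are sometimes much faster than joins

Anyway, I have prepared a sample fiddle with proposed optimizations. I renamed the columns to reduce confusion (in the query below, `user_id` stands in for `followee_id`).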

The final query can be written as follows:

    select *
    from users a
    where
    (
    id in (select follower_id as id from relationships USE INDEX (user_id) where user_id = 256 and status = 'following')
    or id in (select user_id from relationships USE INDEX (follower_id) where follower_id = 256 and status = 'following')
    )
    and id not in (select follower_id from relationships USE INDEX (user_id) where user_id = 2 and status = 'blocked')
    and id not in (select user_id from relationships USE INDEX (follower_id) where follower_id = 2 and status = 'blocked')
    

Benchmark it regardless of what the execution plan says; actual performance on real data may differ.
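
Tips 1 to 3 above could be applied to the question's relationships table roughly as follows. This is a hedged sketch rather than a tested migration: it assumes no existing row has a NULL in any of the three columns, that status only ever contains 'following' or 'blocked', and that `users`.`id` is changed to a matching unsigned type.

```sql
-- Sketch only: tighten the column types on the question's table.
-- Run against a copy of the data first; with strict SQL mode enabled,
-- rows that violate the assumptions above make the statement fail.
ALTER TABLE relationships
  MODIFY follower_id INT UNSIGNED NOT NULL,
  MODIFY followee_id INT UNSIGNED NOT NULL,
  MODIFY status ENUM('following', 'blocked') NOT NULL;
```

An ENUM is stored as a small integer, so the composite indexes that begin with status should also become considerably narrower than with varchar(191).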

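Answer 2 (score: 0)

Don't use IN ( SELECT ... ); it optimizes poorly. Instead, use either a JOIN or EXISTS ( SELECT ... ).

The OR-to-UNION trick is good, but not if it is still inside IN (...).

(For readability, omit the table name when there is only one table. And rename followee_id and/or follower_id; they are spelled too similarly.)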
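
Putting this answer's advice together, the original query might be restructured along the following lines. This is an untested sketch against the question's schema, not a measured fix; the aliases `c`, `u`, `f`, `b1`, `b2` and the derived column `user_id` are my own names. Note that NOT EXISTS behaves differently from NOT IN when the subquery column contains NULLs, which is possible here because both ID columns are nullable.

```sql
-- Sketch only: the OR in the join becomes a UNION in a derived table,
-- and each NOT IN ( SELECT ... ) becomes a NOT EXISTS.
SELECT u.*,
       EXISTS (
         SELECT 1 FROM relationships f
         WHERE f.follower_id = 2 AND f.followee_id = u.id
       ) AS followed
FROM (
       -- users who follow 256
       SELECT follower_id AS user_id
       FROM relationships
       WHERE followee_id = 256 AND status = 'following'
       UNION   -- UNION (not UNION ALL) removes duplicates, replacing DISTINCT
       -- users whom 256 follows
       SELECT followee_id
       FROM relationships
       WHERE follower_id = 256 AND status = 'following'
     ) AS c
JOIN users u ON u.id = c.user_id
WHERE NOT EXISTS (  -- not blocked by user 2
        SELECT 1 FROM relationships b1
        WHERE b1.status = 'blocked'
          AND b1.follower_id = 2 AND b1.followee_id = u.id
      )
  AND NOT EXISTS (  -- not blocking user 2
        SELECT 1 FROM relationships b2
        WHERE b2.status = 'blocked'
          AND b2.followee_id = 2 AND b2.follower_id = u.id
      )
ORDER BY u.id
LIMIT 10;
```

Each branch of the UNION and each NOT EXISTS filters on status plus one ID column with equality, which matches the leading columns of the two composite indexes shown in the question.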