MySQL 100% CPU + slow queries - indexes not being used correctly

Time: 2014-05-30 14:21:04

Tags: mysql database-performance amazon-rds query-performance mysql-slow-query-log

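I am using an Amazon RDS database with some very large tables. Yesterday the server started hitting 100% CPU utilization, together with a bunch of slow query log entries that had not occurred before.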

I checked the queries that were running and got this result from the EXPLAIN command:

+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table                         | type   | possible_keys                                                                                | key                                   | key_len | ref                                                             | rows | Extra                                        |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | businesses                    | const  | PRIMARY                                                                                      | PRIMARY                               | 4       | const                                                           |    1 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | activities_businesses         | ref    | PRIMARY,index_activities_users_on_business_id,index_tweets_users_on_tweet_id_and_business_id | index_activities_users_on_business_id | 9       | const                                                           | 2252 | Using index condition; Using where           |
|  1 | SIMPLE      | activities_b_taggings_975e9c4 | ref    | taggings_idx                                                                                 | taggings_idx                          | 782     | const,myapp_production.activities_businesses.id,const           |    1 | Using index condition; Using where           |
|  1 | SIMPLE      | activities                    | eq_ref | PRIMARY,index_activities_on_created_at                                                       | PRIMARY                               | 8       | myapp_production.activities_businesses.activity_id              |    1 | Using where                                  |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+

Checking the process list at the same time, I got something like this:

+----+-----------------+-------------------------------------+----------------------------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
| Id | User            | Host                                | db                         | Command | Time | State        | Info                                                                                                 |
+----+-----------------+-------------------------------------+----------------------------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
|  1 | my_app          | my_ip:57152                         | my_app_production          | Sleep   |    0 |              | NULL                                                                                                 |
|  2 | my_app          | my_ip:57153                         | my_app_production          | Sleep   |    2 |              | NULL                                                                                                 |
|  3 | rdsadmin        | localhost:49441                     | NULL                       | Sleep   |    9 |              | NULL                                                                                                 |
|  6 | my_app          | my_other_ip:47802                   | my_app_production          | Sleep   |  242 |              | NULL                                                                                                 |
|  7 | my_app          | my_other_ip:47807                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
|  8 | my_app          | my_other_ip:47809                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
|  9 | my_app          | my_other_ip:47810                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
| 10 | my_app          | my_other_ip:47811                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
| 11 | my_app          | my_other_ip:47813                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
...
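
A process list like the one above can also be filtered with a query instead of reading the raw SHOW PROCESSLIST output. The following is only a sketch: the 60-second threshold and the thread Id 7 are arbitrary values picked for illustration. It assumes an Amazon RDS MySQL instance, where SUPER is not granted and RDS provides the `mysql.rds_kill` and `mysql.rds_kill_query` procedures for stopping other users' threads.

```sql
-- List statements that have been running for more than 60 seconds
-- (the threshold is an arbitrary choice for this sketch).
SELECT id, user, host, db, time, state, LEFT(info, 120) AS query_start
FROM   information_schema.processlist
WHERE  command = 'Query'
       AND time > 60
ORDER  BY time DESC;

-- 7 is an example thread Id taken from the list above.
CALL mysql.rds_kill_query(7);   -- stop the statement, keep the connection
-- CALL mysql.rds_kill(7);      -- or terminate the whole connection instead
```

Killing the piled-up statements only relieves the CPU temporarily; the same query will come back until the underlying plan problem is fixed.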

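So, based on these numbers, there seems to be no reason for the query to be slow: the worst step in the execution plan goes through about 2k rows, which is not much.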

Edit 1

Another piece of information that may be useful is the slow query log:

SET timestamp=1401457485;
SELECT my_query...
# User@Host: myapp[myapp] @ ip-10-195-55-233.ec2.internal [IP] Id: 435
# Query_time: 95.830497 Lock_time: 0.000178 Rows_sent: 0 Rows_examined: 1129387
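
For anyone reproducing this, a rough sketch of how the slow query log settings can be inspected from SQL. The second part assumes `log_output` includes `TABLE`; if the instance logs to a file only, `mysql.slow_log` stays empty. On RDS these settings are changed through the DB parameter group rather than with SET GLOBAL.

```sql
-- Is the slow query log enabled, what is the threshold, and where does it go?
SHOW VARIABLES LIKE 'slow_query_log';
SHOW VARIABLES LIKE 'long_query_time';
SHOW VARIABLES LIKE 'log_output';

-- Assumption: log_output includes TABLE, so entries land in mysql.slow_log.
-- Sorting by rows_examined surfaces queries like the one above, where
-- Rows_examined (1129387) is far larger than Rows_sent (0).
SELECT start_time, query_time, rows_sent, rows_examined,
       LEFT(sql_text, 120) AS query_start
FROM   mysql.slow_log
ORDER  BY rows_examined DESC
LIMIT  10;
```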

Edit 2

After profiling, I got the result below. The result is around 250 rows, each with two columns.

+----------------------+----------+
| state                | duration |
+----------------------+----------+
| Sending data         |      272 |
| removing tmp table   |        0 |
| optimizing           |        0 |
| Creating sort index  |        0 |
| init                 |        0 |
| cleaning up          |        0 |
| executing            |        0 |
| checking permissions |        0 |
| freeing items        |        0 |
| Creating tmp table   |        0 |
| query end            |        0 |
| statistics           |        0 |
| end                  |        0 |
| System lock          |        0 |
| Opening tables       |        0 |
| logging slow query   |        0 |
| Sorting result       |        0 |
| starting             |        0 |
| closing tables       |        0 |
| preparing            |        0 |
+----------------------+----------+
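
A per-state summary like the table above can be produced from MySQL's session profiler. This is a sketch under assumptions: the `query_id` value 1 is a placeholder that has to be replaced with the Query_ID reported by SHOW PROFILES, and the exact statement used for the table above is not shown in the post. The profiler is deprecated in newer MySQL versions but is still available in the versions of that time.

```sql
SET profiling = 1;

-- ... run the slow SELECT here ...

SHOW PROFILES;   -- lists recent statements with their Query_ID

-- Total time per state for one profiled statement.
SELECT state, ROUND(SUM(duration)) AS duration
FROM   information_schema.profiling
WHERE  query_id = 1          -- placeholder: use the Query_ID from SHOW PROFILES
GROUP  BY state
ORDER  BY duration DESC;

SET profiling = 0;
```

Nearly all the time landing in "Sending data" is consistent with the server reading and joining rows, since that state covers row access during the SELECT, not just network transfer.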

Edit 3

Adding the query, as requested:

SELECT activities.share_count, 
       activities.created_at 
FROM   `activities_businesses` 
       INNER JOIN `businesses` 
               ON `businesses`.`id` = `activities_businesses`.`business_id` 
       INNER JOIN `activities` 
               ON `activities`.`id` = `activities_businesses`.`activity_id` 
       JOIN taggings activities_b_taggings_975e9c4 
         ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id 
            AND activities_b_taggings_975e9c4.taggable_type = 
                'ActivitiesBusiness' 
            AND activities_b_taggings_975e9c4.tag_id = 104 
            AND activities_b_taggings_975e9c4.created_at >= 
                '2014-04-30 13:36:44' 
WHERE  ( businesses.id = 1 ) 
       AND ( activities.created_at > '2014-04-30 13:36:44' ) 
       AND ( activities.created_at < '2014-05-30 12:27:03' ) 
ORDER  BY activities.created_at; 

Edit 4

The column types differ between taggings.taggable_id and activities_businesses.id, so it is possible that the index is not being applied.

mysql> SHOW COLUMNS FROM activities_businesses;
+-------------+------------+------+-----+---------+----------------+
| Field       | Type       | Null | Key | Default | Extra          |
+-------------+------------+------+-----+---------+----------------+
| id          | int(11)    | NO   | PRI | NULL    | auto_increment |
| activity_id | bigint(20) | YES  | MUL | NULL    |                |
| business_id | bigint(20) | YES  | MUL | NULL    |                |
+-------------+------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)

mysql> SHOW COLUMNS FROM taggings;
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| id            | int(11)      | NO   | PRI | NULL    | auto_increment |
| tag_id        | int(11)      | YES  | MUL | NULL    |                |
| taggable_id   | bigint(20)   | YES  |     | NULL    |                |
| taggable_type | varchar(255) | YES  |     | NULL    |                |
| tagger_id     | int(11)      | YES  |     | NULL    |                |
| tagger_type   | varchar(255) | YES  |     | NULL    |                |
| context       | varchar(128) | YES  |     | NULL    |                |
| created_at    | datetime     | YES  |     | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+
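
The two join columns can also be put side by side in one result instead of comparing two SHOW COLUMNS outputs by eye. A minimal sketch, assuming it runs with the application schema selected as the current database:

```sql
-- Compare the declared types of the two columns used in the join
-- taggings.taggable_id = activities_businesses.id
SELECT table_name, column_name, column_type, column_key
FROM   information_schema.columns
WHERE  table_schema = DATABASE()
       AND ( ( table_name = 'activities_businesses' AND column_name = 'id' )
             OR ( table_name = 'taggings' AND column_name = 'taggable_id' ) );

-- Show which columns taggings_idx covers, and in what order.
SHOW INDEX FROM taggings;
```

With the definitions shown above, this would report int(11) for activities_businesses.id and bigint(20) for taggings.taggable_id.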

So the query is examining far more rows than EXPLAIN shows, probably because some index is not being applied.

Can you help?

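2 Answers:

Answer 0 (score: 0)

As I suspected in Edit 4, MySQL was giving misleading information through the DESCRIBE (EXPLAIN) command.

The command reported the following execution plan: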

+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table                         | type   | possible_keys                                                                                | key                                   | key_len | ref                                                             | rows | Extra                                        |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | businesses                    | const  | PRIMARY                                                                                      | PRIMARY                               | 4       | const                                                           |    1 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | activities_businesses         | ref    | PRIMARY,index_activities_users_on_business_id,index_tweets_users_on_tweet_id_and_business_id | index_activities_users_on_business_id | 9       | const                                                           | 2252 | Using index condition; Using where           |
|  1 | SIMPLE      | activities_b_taggings_975e9c4 | ref    | taggings_idx                                                                                 | taggings_idx                          | 782     | const,myapp_production.activities_businesses.id,const           |    1 | Using index condition; Using where           |
|  1 | SIMPLE      | activities                    | eq_ref | PRIMARY,index_activities_on_created_at                                                       | PRIMARY                               | 8       | myapp_production.activities_businesses.activity_id              |    1 | Using where                                  |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+

Clearly, that is not what was actually happening: the slow query log shows a far larger number of rows examined.

# Query_time: 95.830497 Lock_time: 0.000178 Rows_sent: 0 Rows_examined: 1129387
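
The gap between EXPLAIN's estimate and the real work can also be measured without waiting for the slow query log, by resetting the session counters and reading the handler statistics after the query. This is a sketch, not what was done here; FLUSH STATUS needs the RELOAD privilege, which is assumed to be available to the account in use.

```sql
FLUSH STATUS;   -- reset this session's status counters

-- The query from Edit 3
SELECT activities.share_count,
       activities.created_at
FROM   `activities_businesses`
       INNER JOIN `businesses`
               ON `businesses`.`id` = `activities_businesses`.`business_id`
       INNER JOIN `activities`
               ON `activities`.`id` = `activities_businesses`.`activity_id`
       JOIN taggings activities_b_taggings_975e9c4
         ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id
            AND activities_b_taggings_975e9c4.taggable_type = 'ActivitiesBusiness'
            AND activities_b_taggings_975e9c4.tag_id = 104
            AND activities_b_taggings_975e9c4.created_at >= '2014-04-30 13:36:44'
WHERE  ( businesses.id = 1 )
       AND ( activities.created_at > '2014-04-30 13:36:44' )
       AND ( activities.created_at < '2014-05-30 12:27:03' )
ORDER  BY activities.created_at;

-- Rows actually read through indexes and table scans by the statement above.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

Handler_read counters in the hundreds of thousands, against an EXPLAIN estimate of roughly 2k rows, would point to the same mismatch that Rows_examined shows.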

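With this information in hand, I checked the joins and found that the ID columns being joined do not use the same type (activities_businesses.id is int(11), while taggings.taggable_id is bigint(20)):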

mysql> SHOW COLUMNS FROM activities_businesses;
+-------------+------------+------+-----+---------+----------------+
| Field       | Type       | Null | Key | Default | Extra          |
+-------------+------------+------+-----+---------+----------------+
| id          | int(11)    | NO   | PRI | NULL    | auto_increment |
...

mysql> SHOW COLUMNS FROM taggings;
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| id            | int(11)      | NO   | PRI | NULL    | auto_increment |
| tag_id        | int(11)      | YES  | MUL | NULL    |                |
| taggable_id   | bigint(20)   | YES  |     | NULL    |                |
...

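After researching the topic a little, it appears that MySQL does not use indexes for joins on columns of different types, or at least not properly, even though the EXPLAIN command says it does.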

Joining on columns of different type?

Performance of JOIN using columns of different numeric types

So in the end, after changing both columns to the same type, the query's execution time dropped to about 50 ms, which is good enough for my scenario.
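
The answer does not say which column was changed or to what type, so the following is only a sketch of one way to do it. It assumes widening activities_businesses.id to BIGINT so that it matches taggings.taggable_id, which is polymorphic and may also point at tables with bigint ids. On a large table this rebuilds the table and can block writes, and any foreign keys referencing the column would need the same change.

```sql
-- Assumption: align on BIGINT by widening the narrower column.
-- MODIFY must restate the full column definition, or attributes such as
-- NOT NULL and AUTO_INCREMENT are lost.
ALTER TABLE activities_businesses
  MODIFY id BIGINT(20) NOT NULL AUTO_INCREMENT;

-- Confirm that both sides of the join now report the same type.
SHOW COLUMNS FROM activities_businesses LIKE 'id';
SHOW COLUMNS FROM taggings LIKE 'taggable_id';
```

After the change, it is worth re-running the original query and comparing Rows_examined in the slow query log with the earlier value of 1129387.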

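Answer 1 (score: -1)

I suspect the ORDER BY clause is the culprit. Try a few things. Add an index on the created_at column. If that works, great! Otherwise (or if it works and you want an even faster query), run this query and see whether it takes longer to execute: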

SELECT activities.share_count, 
       activities.created_at 
FROM   `activities`  
       -- activities_businesses is joined first so that the next ON clause can reference it
       JOIN `activities_businesses`
               ON `activities`.`id` = `activities_businesses`.`activity_id`
       JOIN `businesses`
               ON `businesses`.`id` = `activities_businesses`.`business_id`
       JOIN taggings activities_b_taggings_975e9c4 
         ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id 
            AND activities_b_taggings_975e9c4.taggable_type = 
                'ActivitiesBusiness' 
            AND activities_b_taggings_975e9c4.tag_id = 104 
            AND activities_b_taggings_975e9c4.created_at >= 
                '2014-04-30 13:36:44' 
WHERE  ( businesses.id = 1 ) 
       AND ( activities.created_at > '2014-04-30 13:36:44' ) 
       AND ( activities.created_at < '2014-05-30 12:27:03' ) 
ORDER  BY activities.created_at; 

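If that fixes it, awesome!

Otherwise (or if the previous suggestion worked and you want the query to be even faster), do the following two steps:

1. Run the same query without the ORDER BY clause: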

SELECT activities.share_count, 
       activities.created_at 
FROM   `activities`  
       -- activities_businesses is joined first so that the next ON clause can reference it
       JOIN `activities_businesses`
               ON `activities`.`id` = `activities_businesses`.`activity_id`
       JOIN `businesses`
               ON `businesses`.`id` = `activities_businesses`.`business_id`
       JOIN taggings activities_b_taggings_975e9c4 
         ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id 
            AND activities_b_taggings_975e9c4.taggable_type = 
                'ActivitiesBusiness' 
            AND activities_b_taggings_975e9c4.tag_id = 104 
            AND activities_b_taggings_975e9c4.created_at >= 
                '2014-04-30 13:36:44' 
WHERE  ( businesses.id = 1 ) 
       AND ( activities.created_at > '2014-04-30 13:36:44' ) 
       AND ( activities.created_at < '2014-05-30 12:27:03' );

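If that runs super duper fast, great.

2. Dump the results into a temporary table and then sort that. This breaks one expensive operation into smaller operations that are easier for the server to execute, i.e.: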

CREATE TEMPORARY TABLE temp_activities (INDEX(created_at))
SELECT activities.share_count, 
           activities.created_at 
    FROM   `activities`  
           -- activities_businesses is joined first so that the next ON clause can reference it
           JOIN `activities_businesses`
                   ON `activities`.`id` = `activities_businesses`.`activity_id`
           JOIN `businesses`
                   ON `businesses`.`id` = `activities_businesses`.`business_id`
           JOIN taggings activities_b_taggings_975e9c4 
             ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id 
                AND activities_b_taggings_975e9c4.taggable_type = 
                    'ActivitiesBusiness' 
                AND activities_b_taggings_975e9c4.tag_id = 104 
                AND activities_b_taggings_975e9c4.created_at >= 
                    '2014-04-30 13:36:44' 
    WHERE  ( businesses.id = 1 ) 
           AND ( activities.created_at > '2014-04-30 13:36:44' ) 
           AND ( activities.created_at < '2014-05-30 12:27:03' );
SELECT * FROM temp_activities ORDER BY created_at;