Slow MySQL query with millions of records, GROUP BY and JOIN

Date: 2014-06-03 12:26:16

Tags: mysql performance

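I have a table that logs device data connections; it currently holds 82 million rows. I have another table containing device information such as location, serial number, customer, etc.; it has 4,000 records.

Table 1 has the following structure: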

+---------------+-------------+
| Column        | type        | 
+---------------+-------------+
| id            | int(11)     | Primary Key
| billing_date  | date        |
| imsi          | varchar(255)| 
| bytes_input   | int(10)     | 
| bytes_output  | int(10)     | 
| total_in_kibs | int(10)     | 
+---------------+-------------+
It has an index, group_idx, covering imsi, billing_date, total_in_kibs, bytes_input and bytes_output.

Table 2 has the following structure:

+---------------+-------------+
| Column        | type        | 
+---------------+-------------+
| id            | int(11)     | Primary Key
| System        | varchar(50) |
| Customer      | varchar(50) | 
| Sitecode      | varchar(50) | 
| Serialnumber  | varchar(50) | 
| Name          | varchar(50) | 
| imsi          | varchar(255)| 
+---------------+-------------+

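Index: site_idx, covering imsi, Name, System, Customer, Sitecode and Serialnumber. id is the primary key.

What I want to do is find the total amount of data (the total_in_kibs column) that each device used in a given month.

The query I am using is: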

select x.imsi,y.name, y.Customer, y.System, y.Serialnumber, 
(x.bytesin / 1048576) as 'In (Mb)', (x.bytesout / 1048576) as 'Out (Mb)', 
(x.total / 1024) as 'Total (Mb)' 
FROM  
    (SELECT imsi, sum(bytes_input) as bytesin, 
    sum(bytes_output) as bytesout, sum(total_in_kibs) as total 
   FROM table1 
   WHERE month(billing_date) = 5 
   GROUP BY imsi) as x 
JOIN table2 as y 
on y.imsi = x.imsi 

The EXPLAIN command gives the following:

+----+-------------+------------+-------+---------------+-----------+---------+-------------+----------+--------------------------+
| id | select_type | table      | type  | possible_keys | key       | key_len | ref         | rows     | Extra                    |
+----+-------------+------------+-------+---------------+-----------+---------+-------------+----------+--------------------------+
| 1  | PRIMARY     | y          | index | imsi,sign_idx | sign_idx  | 479     | NULL        | 4100     | Using where; Using index |
| 1  | PRIMARY     | <derived2> | ref   | key0          | key0      | 258     | data.y.imsi | 20087    |                          |
| 2  | DERIVED     | table1     | index | NULL          | group_idx | 272     | NULL        | 82358731 | Using where; Using index |
+----+-------------+------------+-------+---------------+-----------+---------+-------------+----------+--------------------------+

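Is there any way to speed up the query? It will be used as part of a search tool on a web page, and at the moment the page takes too long to load.

Thanks in advance - James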

Table 1
+-----------------------+-------------+------+-----+---------+----------------+
| Field                 | Type        | Null | Key | Default | Extra          |
+-----------------------+-------------+------+-----+---------+----------------+
| id                    | int(11)     | NO   | PRI | NULL    | auto_increment |
| billing_date          | date        | NO   | MUL | NULL    |                |
| unique_id             | char(10)    | NO   |     | NULL    |                |
| imsi                  | bigint(15)  | NO   |     | NULL    |                |
| tim_state             | char(2)     | NO   |     | NULL    |                |
| customer_profile_name | varchar(25) | NO   |     | NULL    |                |
| serving_opco          | char(5)     | NO   |     | NULL    |                |
| session_start         | datetime    | NO   |     | NULL    |                |
| session_end           | datetime    | NO   |     | NULL    |                |
| bytes_input           | int(10)     | NO   |     | NULL    |                |
| bytes_output          | int(10)     | NO   |     | NULL    |                |
| total_in_kibs         | int(10)     | NO   |     | NULL    |                |
+-----------------------+-------------+------+-----+---------+----------------+

Indexes for table 1
+--------------+------------+------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table        | Non_unique | Key_name         | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| transactions |          0 | PRIMARY          |            1 | id           | A         |    91714401 |     NULL | NULL   |      | BTREE      |         |               |
| transactions |          1 | billing_date_idx |            1 | billing_date | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
+--------------+------------+------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+


Table 2
+----------------------+-------------+------+-----+---------+----------------+
| Field                | Type        | Null | Key | Default | Extra          |
+----------------------+-------------+------+-----+---------+----------------+
| id                   | int(11)     | NO   | PRI | NULL    | auto_increment |
| System               | varchar(50) | YES  |     | NULL    |                |
| Customer             | varchar(50) | NO   |     | NULL    |                |
| Sitecode             | varchar(50) | NO   |     | NULL    |                |
| Name                 | varchar(50) | NO   |     | NULL    |                |
| Serialnumber         | varchar(10) | NO   |     | NULL    |                |
| Operator             | varchar(50) | NO   |     | NULL    |                |
| Sign_Serialnumber    | varchar(10) | NO   |     | NULL    |                |
| imsi                 | bigint(15)  | YES  | MUL | NULL    |                |
| dtInstalledDateGMT   | datetime    | NO   |     | NULL    |                |
| dtUninstalledDateGMT | datetime    | NO   |     | NULL    |                |
+----------------------+-------------+------+-----+---------+----------------+
Indexes for table 2

+-------+------------+----------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| signs |          0 | PRIMARY  |            1 | id           | A         |        4172 |     NULL | NULL   |      | BTREE      |         |               |
| signs |          1 | imsi     |            1 | imsi         | A         |        4172 |     NULL | NULL   | YES  | BTREE      |         |               |
| signs |          1 | sign_idx |            1 | imsi         | A         |        4172 |     NULL | NULL   | YES  | BTREE      |         |               |
| signs |          1 | sign_idx |            2 | Name         | A         |        4172 |     NULL | NULL   |      | BTREE      |         |               |
| signs |          1 | sign_idx |            3 | System       | A         |        4172 |     NULL | NULL   | YES  | BTREE      |         |               |
| signs |          1 | sign_idx |            4 | Customer     | A         |        4172 |     NULL | NULL   |      | BTREE      |         |               |
| signs |          1 | sign_idx |            5 | Sitecode     | A         |        4172 |     NULL | NULL   |      | BTREE      |         |               |
| signs |          1 | sign_idx |            6 | Serialnumber | A         |        4172 |     NULL | NULL   |      | BTREE      |         |               |
+-------+------------+----------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+


Explain for billing_date index query 
+------+-------------+-------+------+------------------+----------+---------+---------------------+----------+----------------------------------------------+
| id   | select_type | table | type | possible_keys    | key      | key_len | ref                 | rows     | Extra                                        |
+------+-------------+-------+------+------------------+----------+---------+---------------------+----------+----------------------------------------------+
|    1 | SIMPLE      | x     | ALL  | billing_date_idx | NULL     | NULL    | NULL                | 91714401 | Using where; Using temporary; Using filesort |
|    1 | SIMPLE      | y     | ref  | imsi,sign_idx    | sign_idx | 9       | vodafonegdsp.x.imsi |        1 | Using index                                  |
+------+-------------+-------+------+------------------+----------+---------+---------------------+----------+----------------------------------------------+

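1 answer:

Answer 0 (score: 0)

If imsi is unique in the tables (I suspect it is unique in combination with the billing date in table 1, and probably unique on its own in the other table), then you can eliminate the subquery: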

SELECT x.imsi,
        y.name, 
        y.Customer, 
        y.System, 
        y.Serialnumber, 
        (SUM(x.bytes_input) / 1048576) AS 'In (Mb)', 
        (SUM(x.bytes_output) / 1048576) AS 'Out (Mb)', 
        (SUM(x.total_in_kibs) / 1024) AS 'Total (Mb)' 
FROM table1 AS x 
INNER JOIN table2 AS y 
ON y.imsi = x.imsi 
WHERE month(x.billing_date) = 5 
GROUP BY x.imsi

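Also, the group_idx index on table 1 (covering imsi, billing_date, total_in_kibs, bytes_input, bytes_output) seems excessive. I doubt that having the various totals at the end of that index is useful; narrowing the rows down by imsi and billing date alone should be efficient enough.

EDIT

I am struggling to find a way to force it to use a useful index. The following might do it, but I am far from certain.

Try the following, but with an index on billing_date (just the billing date):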

SELECT y.imsi,
        y.name, 
        y.Customer, 
        y.System, 
        y.Serialnumber, 
        (SUM(x.bytes_input) / 1048576) AS 'In (Mb)', 
        (SUM(x.bytes_output) / 1048576) AS 'Out (Mb)', 
        (SUM(x.total_in_kibs) / 1024) AS 'Total (Mb)' 
FROM table1 AS x 
INNER JOIN table2 AS y 
ON y.imsi = x.imsi 
AND x.billing_date BETWEEN '2014/05/01' AND '2014/05/31'
GROUP BY y.imsi,
        y.name, 
        y.Customer, 
        y.System, 
        y.Serialnumber

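That should then use that index to bring back the subset of rows from table1 for May. If you have an index on imsi on table2 (either as the first column of a covering index, or as the only column in the index), that should be used for the join.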
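As a rough sketch of the indexes described above: the index names below are hypothetical, and table1 and table2 are the placeholder table names from the question. The schema listing in the question suggests that comparable indexes (billing_date_idx, imsi) may already exist, in which case the CREATE INDEX statements are unnecessary and only the EXPLAIN check is of interest.

```sql
-- Hypothetical index names; skip either statement if an equivalent index already exists.
-- Single-column index on the billing date, for the date-range filter.
CREATE INDEX billing_date_idx ON table1 (billing_date);

-- Index with imsi as its leading (here: only) column, for the join to table2.
CREATE INDEX imsi_idx ON table2 (imsi);

-- Check which indexes the optimizer actually chooses for the range query.
EXPLAIN
SELECT y.imsi,
       SUM(x.total_in_kibs) / 1024 AS total_mb
FROM table1 AS x
INNER JOIN table2 AS y
    ON y.imsi = x.imsi
WHERE x.billing_date BETWEEN '2014-05-01' AND '2014-05-31'
GROUP BY y.imsi;
```

Even with the index in place, the optimizer may still prefer a full table scan when the date range covers a large share of the rows, so it is worth reading the key column of the EXPLAIN output rather than assuming the index is used.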

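Using MONTH(billing_date) = 5 will not use an index, because the MONTH function has to be run on every row (82 million times) before the result can be checked. It also matches May of every year in the table, not just 2014. Checking a date range should use the index on the billing date.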
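To make the contrast concrete, here is a minimal sketch of the two filter styles against table1. The half-open range (greater than or equal to the first day of the month, less than the first day of the next) is a variation on the BETWEEN form used above, chosen here as an assumption because it avoids having to know the last day of the month.

```sql
-- Not index-friendly: MONTH() is evaluated for every row before the comparison.
SELECT imsi, SUM(total_in_kibs) AS total
FROM table1
WHERE MONTH(billing_date) = 5
GROUP BY imsi;

-- Index-friendly: the bare column is compared against constants,
-- so an index on billing_date can be used for a range scan.
SELECT imsi, SUM(total_in_kibs) AS total
FROM table1
WHERE billing_date >= '2014-05-01'
  AND billing_date <  '2014-06-01'
GROUP BY imsi;
```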