Performance problem in a MySQL query using GROUP BY and ORDER BY

Date: 2013-09-29 12:27:54

Tags: mysql query-optimization query-performance slowdown

1) The first query I used takes about 23 seconds:

select a.id from mza_movie_upload a,mza_movie_statics b 
where a.status=1 and b.download=1 and a.id=b.rid 
group by b.rid order by sum(b.download) desc

I have since rewritten the query, and it now takes about 9 seconds:

select a.id from mza_movie_upload a 
INNER JOIN mza_movie_statics b 
ON a.id=b.rid WHERE a.status=1 and b.download=1 
group by b.rid order by sum(b.download) desc

explain select a.id from mza_movie_upload a  INNER JOIN mza_movie_statics b  ON     a.id=b.rid WHERE a.status=1 and b.download=1  group by b.rid order by sum(b.download) desc;
+----+-------------+-------+--------+---------------+---------+---------+----------------------+---------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                  | rows    | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+----------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | b     | ALL    | NULL          | NULL    | NULL    | NULL                 | 1603089 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | a     | eq_ref | PRIMARY       | PRIMARY | 4       | mmdfurni_dev11.b.rid |       1 | Using where                                  |
+----+-------------+-------+--------+---------------+---------+---------+----------------------+---------+----------------------------------------------+
2 rows in set (0.03 sec)

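I'm not sure what else I can do to improve performance. I want this query to be fast. I tried adding indexes on rid and id, but that made the query even slower.

Here are the table details: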

mza_movie_upload

+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| id            | int(11)      | NO   | PRI | NULL    | auto_increment |
| userid        | varchar(200) | NO   |     | NULL    |                |
| email         | varchar(200) | NO   |     | NULL    |                |
| up_date       | datetime     | NO   |     | NULL    |                |
| file_size     | varchar(200) | NO   |     | NULL    |                |
| temp_filename | varchar(200) | NO   |     | NULL    |                |
| fileneame     | varchar(200) | NO   | MUL | NULL    |                |
| filepath      | varchar(255) | NO   |     | NULL    |                |
| status        | varchar(20)  | NO   |     | NULL    |                |
| ip            | varchar(200) | NO   |     | NULL    |                |
| category      | varchar(200) | NO   |     | NULL    |                |
| mcode         | bigint(20)   | NO   |     | NULL    |                |
| movie_name    | varchar(200) | NO   |     | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+
13 rows in set (0.00 sec)

mza_movie_statics

+-----------+---------+------+-----+---------+----------------+
| Field     | Type    | Null | Key | Default | Extra          |
+-----------+---------+------+-----+---------+----------------+
| id        | int(11) | NO   | PRI | NULL    | auto_increment |
| rid       | int(11) | NO   |     | NULL    |                |
| uid       | int(11) | NO   |     | NULL    |                |
| save      | int(11) | NO   |     | NULL    |                |
| download  | int(11) | NO   |     | NULL    |                |
| enterdate | date    | NO   |     | NULL    |                |
+-----------+---------+------+-----+---------+----------------+
6 rows in set (0.00 sec)

3 answers:

Answer 0 (score: 0)

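If you want a further performance gain, I suggest adding indexes on a.status and/or b.download. Keep in mind that extra indexes add overhead to every insert, update, and delete - in this case, though, the trade-off looks necessary.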

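Also, before adding new indexes to these tables (presumably in your production environment), remember that MySQL will build a temporary copy of the table, which can take a while for tables with many rows (> 1 million). I therefore suggest testing locally on a table of similar size first.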
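One way to run that test is to build the index on a scratch copy of the table, which gives a rough idea of how long the real build will take. This is only a sketch: the copy's name and the index name are made up for illustration, and it assumes you are on a test server with enough free disk space for a second copy of the data.

```sql
-- Hypothetical scratch table; the live table is only read, never altered.
CREATE TABLE mza_movie_statics_copy LIKE mza_movie_statics;
INSERT INTO mza_movie_statics_copy SELECT * FROM mza_movie_statics;

-- Time the index build on the copy (index name is illustrative).
CREATE INDEX idx_download ON mza_movie_statics_copy (download);

-- Clean up once you have your timing.
DROP TABLE mza_movie_statics_copy;
```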

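Finally, I noticed that your WHERE clause has a.status = 1, but the status column is a varchar. To avoid a conversion between two different data types (which slows down execution and may defeat your future index), I suggest changing it to a.status = '1' (note the quotes).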
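You can check the effect yourself by comparing the plans for the two forms of the predicate. The sketch below assumes an index on status already exists (the table in the question does not have one yet); without it both plans will simply show a full scan.

```sql
-- Numeric literal against a varchar column: each row's status must be
-- converted to a number, so an index on status generally cannot be used.
EXPLAIN SELECT id FROM mza_movie_upload WHERE status = 1;

-- String literal matches the column type, so the index is usable.
EXPLAIN SELECT id FROM mza_movie_upload WHERE status = '1';
```

Compare the type, key, and rows columns of the two plans.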

Answer 1 (score: 0)

Try rewriting the query as:

SELECT b.rid 
FROM mza_movie_upload a 
INNER JOIN mza_movie_statics b 
ON a.id=b.rid 
WHERE a.status= '1'  and b.download= '1'  
-- group by b.rid order by sum(b.download) desc;
GROUP BY b.rid ORDER BY count(*) DESC;

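In this query, SELECT a.id is replaced by SELECT b.rid. Because of the JOIN ... ON a.id=b.rid predicate it is 100% equivalent to the original query, but it leads MySQL to choose a better plan.
Also, as @Dennis Leon said, a.status= '1' and b.download= '1' are now compared as strings rather than numbers.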

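Also try replacing order by sum(b.download) desc with order by count(*) desc. Because the query only retrieves rows where b.download = '1', sum( b.download ) is equivalent to count(*). This change can save a few hundred milliseconds on string-to-number conversion inside SUM( .. ).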
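If you want to convince yourself that the two aggregates agree before swapping them, a quick sanity check along these lines should do. The column aliases are just illustrative names.

```sql
-- With the download = 1 filter, every row contributes exactly 1 to the sum,
-- so the two columns should show the same number.
SELECT COUNT(*)      AS row_count,
       SUM(download) AS download_sum
FROM mza_movie_statics
WHERE download = 1;
```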

Finally, create two indexes:

create index bbbb on mza_movie_statics( download, rid );
create index aaaaa on mza_movie_upload( status );

Then check the query speed again with the changes above in place.
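Besides timing the query, it is worth re-running EXPLAIN to see whether the optimizer actually picks up the new indexes. This is a sketch of that check; whether the indexes are chosen depends on your data and MySQL version, so treat it as something to verify rather than a guaranteed result.

```sql
EXPLAIN
SELECT b.rid
FROM mza_movie_upload a
INNER JOIN mza_movie_statics b
        ON a.id = b.rid
WHERE a.status = '1' AND b.download = '1'
GROUP BY b.rid
ORDER BY COUNT(*) DESC;
```

In the plan from the question, table b showed type ALL with NULL in the key column and roughly 1.6 million rows examined. After the change, look for a non-NULL key on b and a smaller rows estimate.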

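Answer 2 (score: 0)

Your query can be optimized better if you have what is known as a COVERING index, that is, an index containing the columns relevant to what you are looking for, including those used in the conditions. That way the engine does not have to go to the raw data pages to actually check the status and download columns.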

So, have an index on mza_movie_upload (id, status) and an index on mza_movie_statics (rid, download).
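Spelled out as statements, those two indexes could be created roughly like this. The index names are placeholders I picked for illustration; use whatever naming convention you already have.

```sql
-- Covers the join column and the status filter on the upload table.
CREATE INDEX idx_upload_id_status ON mza_movie_upload (id, status);

-- Covers the join column and the download filter/aggregate on the stats table.
CREATE INDEX idx_statics_rid_download ON mza_movie_statics (rid, download);
```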

Next, GROUP BY works best when it runs on the index that drives the query. Since a.id = b.rid, a.id can serve as the driving index, so let IT be the group-by value.

select
      mu.id
   from
      mza_movie_upload mu
         JOIN mza_movie_statics ms
            on mu.id = ms.rid
           AND ms.download > 0
   where
      mu.status = '1'   -- status filter carried over from the question
   group by
      mu.id
   order by
      sum( ms.download ) DESC

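Now, a comment on download. It appears to be numeric, so you probably do not want to compare it explicitly to '1', since the column looks like a counter of how many times something was downloaded. What you are looking for is the most-downloaded items. If the value is always 1, then yes, keep it as = 1 instead of > 0.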
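To find out which case applies, you could look at the values the column actually holds. A minimal sketch (the row_count alias is just an illustrative name):

```sql
-- One row per distinct download value, with how often it occurs.
SELECT download, COUNT(*) AS row_count
FROM mza_movie_statics
GROUP BY download
ORDER BY download;
```

If only 0 and 1 come back, the column is effectively a flag and = 1 is fine. If larger values appear, it is a counter and > 0 with SUM is the safer choice.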