MySQL query optimization - DISTINCT, ORDER BY and LIMIT

Asked: 2010-05-27 03:37:44

Tags: mysql query-optimization sql-execution-plan

I am trying to optimize the following query:

select distinct this_.id as y0_
from Rental this_
    left outer join RentalRequest rentalrequ1_ 
      on this_.id=rentalrequ1_.rental_id
    left outer join RentalSegment rentalsegm2_ 
      on rentalrequ1_.id=rentalsegm2_.rentalRequest_id
where
    this_.DTYPE='B'
    and this_.id<=1848978
    and this_.billingStatus=1
    and rentalsegm2_.endDate between 1273631699529 and 1274927699529
order by rentalsegm2_.id asc
limit 0, 100;

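This query is run many times in a row to process records page by page (each time with a different limit). It returns the IDs I need for the processing. My problem is that the query takes more than 3 seconds. Each of the three tables has about 2 million rows.

EXPLAIN gives: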

+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+
| id | select_type | table        | type   | possible_keys                                       | key           | key_len | ref                                        | rows   | Extra                                        |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+
|  1 | SIMPLE      | rentalsegm2_ | range  | index_endDate,fk_rentalRequest_id_BikeRentalSegment | index_endDate | 9       | NULL                                       | 449904 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | rentalrequ1_ | eq_ref | PRIMARY,fk_rental_id_BikeRentalRequest              | PRIMARY       | 8       | solscsm_main.rentalsegm2_.rentalRequest_id |      1 | Using where                                  | 
|  1 | SIMPLE      | this_        | eq_ref | PRIMARY,index_billingStatus                         | PRIMARY       | 8       | solscsm_main.rentalrequ1_.rental_id        |      1 | Using where                                  | 
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+

I tried removing the DISTINCT, and the query ran three times faster. EXPLAIN without the DISTINCT gives:

+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+
| id | select_type | table        | type   | possible_keys                                       | key           | key_len | ref                                        | rows   | Extra                       |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+
|  1 | SIMPLE      | rentalsegm2_ | range  | index_endDate,fk_rentalRequest_id_BikeRentalSegment | index_endDate | 9       | NULL                                       | 451972 | Using where; Using filesort | 
|  1 | SIMPLE      | rentalrequ1_ | eq_ref | PRIMARY,fk_rental_id_BikeRentalRequest              | PRIMARY       | 8       | solscsm_main.rentalsegm2_.rentalRequest_id |      1 | Using where                 | 
|  1 | SIMPLE      | this_        | eq_ref | PRIMARY,index_billingStatus                         | PRIMARY       | 8       | solscsm_main.rentalrequ1_.rental_id        |      1 | Using where                 | 
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+

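As you can see, Using temporary is added when DISTINCT is used.

I already have an index on every field used in the WHERE clause. Is there anything I can do to optimize this query?

Thanks a lot!

Edit: I tried ordering by this_.id as suggested, and the query became 5 times slower. Here is the explain plan: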

+----+-------------+--------------+------+-----------------------------------------------------+---------------------------------------+---------+------------------------------+--------+----------------------------------------------+
| id | select_type | table        | type | possible_keys                                       | key                                   | key_len | ref                          | rows   | Extra                                        |
+----+-------------+--------------+------+-----------------------------------------------------+---------------------------------------+---------+------------------------------+--------+----------------------------------------------+
|  1 | SIMPLE      | this_        | ref  | PRIMARY,index_billingStatus                         | index_billingStatus                   | 5       | const                        | 782348 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | rentalrequ1_ | ref  | PRIMARY,fk_rental_id_BikeRentalRequest              | fk_rental_id_BikeRentalRequest        | 9       | solscsm_main.this_.id        |      1 | Using where; Using index; Distinct           | 
|  1 | SIMPLE      | rentalsegm2_ | ref  | index_endDate,fk_rentalRequest_id_BikeRentalSegment | fk_rentalRequest_id_BikeRentalSegment | 8       | solscsm_main.rentalrequ1_.id |      1 | Using where; Distinct                        | 
+----+-------------+--------------+------+-----------------------------------------------------+---------------------------------------+---------+------------------------------+--------+----------------------------------------------+

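3 answers:

Answer 0 (score: 2)

The query without DISTINCT runs faster because you have a LIMIT clause. Without DISTINCT, the server only needs to look at the first 100 matches. Some of those rows may contain duplicate values, however, so once you introduce DISTINCT the server has to look at many more rows to find 100 rows without duplicates.

BTW, why are you using OUTER JOIN?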
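The effect described in this answer can be seen on a toy table. This is only a sketch: dup_demo and its contents are hypothetical and not part of the question's schema, and the actual number of rows read depends on the plan MySQL picks.

```sql
-- Hypothetical toy table, not part of the question's schema.
CREATE TABLE dup_demo (rental_id INT NOT NULL);
INSERT INTO dup_demo (rental_id) VALUES (1), (1), (1), (2), (2), (3);

-- Any 3 rows satisfy this, so the server can stop after reading 3 rows.
SELECT rental_id FROM dup_demo LIMIT 3;

-- Three different values are needed, so the server may have to read
-- far more than 3 rows (here possibly all 6) before it can stop.
SELECT DISTINCT rental_id FROM dup_demo LIMIT 3;
```

In the question the same thing happens at a larger scale: one Rental can match several RentalSegment rows, so 100 distinct this_.id values can require many more than 100 joined rows.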

Answer 1 (score: 2)

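  1. From the execution plan we can see that the optimizer is smart enough to understand that OUTER JOIN is not needed here (the WHERE condition on rentalsegm2_.endDate discards the NULL rows an outer join would add). Either way, you had better state that explicitly.
  2. The DISTINCT modifier means that you want to group by all the fields in the SELECT part, that is, order by all of the specified fields and then discard duplicates. In other words, the order by rentalsegm2_.id asc clause does not make any sense here.
  3. The following query should return an equivalent result: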

    select distinct this_.id as y0_
    from Rental this_
        join RentalRequest rentalrequ1_ 
          on this_.id=rentalrequ1_.rental_id
        join RentalSegment rentalsegm2_ 
          on rentalrequ1_.id=rentalsegm2_.rentalRequest_id
    where
        this_.DTYPE='B'
        and this_.id<=1848978
        and this_.billingStatus=1
        and rentalsegm2_.endDate between 1273631699529 and 1274927699529
    limit 0, 100;
    

    UPD

    If you want the execution plan to start with RentalSegment, you need to add the following indexes to the database:

    1. RentalSegment(endDate)
    2. RentalRequest(id,rental_id)
    3. Rental(id, DTYPE, billingStatus) or (id, billingStatus, DTYPE)

    The query can then be rewritten as follows:

      SELECT this_.id as y0_
      FROM RentalSegment rs
          JOIN RentalRequest rr
          JOIN Rental this_
      WHERE rs.endDate between 1273631699529 and 1274927699529
          AND rs.rentalRequest_id = rr.id
          AND rr.rental_id <= 1848978
          AND rr.rental_id = this_.id
          AND this_.DTYPE='B'
          AND this_.billingStatus = 1
      GROUP BY this_.id
      LIMIT 0, 100;
      

      If the execution plan does not start with RentalSegment, you can force it with STRAIGHT_JOIN.
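A minimal sketch of the two suggestions in this answer, the additional indexes and the STRAIGHT_JOIN fallback. The index names are hypothetical, DTYPE = 'B' is taken from the question, and whether this is actually faster has to be checked with EXPLAIN and timing on the real data.

```sql
-- Index names are hypothetical. An index on RentalSegment(endDate)
-- (index_endDate) already appears in the question's EXPLAIN output,
-- so it is not created again here.
CREATE INDEX idx_request_id_rental ON RentalRequest (id, rental_id);
CREATE INDEX idx_rental_id_dtype_billing ON Rental (id, DTYPE, billingStatus);

-- SELECT STRAIGHT_JOIN makes MySQL join the tables in the order in which
-- they are listed in FROM, so RentalSegment is read first.
SELECT STRAIGHT_JOIN this_.id AS y0_
FROM RentalSegment rs
    JOIN RentalRequest rr ON rs.rentalRequest_id = rr.id
    JOIN Rental this_ ON rr.rental_id = this_.id
WHERE rs.endDate BETWEEN 1273631699529 AND 1274927699529
    AND rr.rental_id <= 1848978
    AND this_.DTYPE = 'B'
    AND this_.billingStatus = 1
GROUP BY this_.id
LIMIT 0, 100;
```

The join conditions are written with ON here only for readability; they are the same conditions the rewritten query keeps in its WHERE clause.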

Answer 2 (score: 1)

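Here, for the "rentalsegm2_" table, the optimizer has chosen the "index_endDate" index, and the number of rows expected from that table is about 450,000 (4.5 lakh). Since there are other conditions as well, you can check the indexes on the "this_" table. I mean, you can check how many records are affected by each condition on the "this_" table.

In summary, you can try alternative solutions by changing the index the optimizer uses. This can be done with the "USE INDEX" and "FORCE INDEX" hints.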
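What this answer suggests could look roughly like the following. The table, column, and index names come from the question and its EXPLAIN output; the choice of index_billingStatus in the hint is only an illustration, not a recommendation, and should be swapped for whichever index the counts point to.

```sql
-- How many Rental rows does each condition match on its own?
SELECT COUNT(*) FROM Rental WHERE DTYPE = 'B';
SELECT COUNT(*) FROM Rental WHERE billingStatus = 1;
SELECT COUNT(*) FROM Rental WHERE id <= 1848978;

-- The question's query with an index hint on Rental; EXPLAIN shows
-- whether the hint changes the plan before the query is timed.
EXPLAIN
SELECT DISTINCT this_.id AS y0_
FROM Rental this_ FORCE INDEX (index_billingStatus)
    LEFT OUTER JOIN RentalRequest rentalrequ1_
      ON this_.id = rentalrequ1_.rental_id
    LEFT OUTER JOIN RentalSegment rentalsegm2_
      ON rentalrequ1_.id = rentalsegm2_.rentalRequest_id
WHERE this_.DTYPE = 'B'
    AND this_.id <= 1848978
    AND this_.billingStatus = 1
    AND rentalsegm2_.endDate BETWEEN 1273631699529 AND 1274927699529
ORDER BY rentalsegm2_.id ASC
LIMIT 0, 100;
```

USE INDEX takes the same form but only restricts the optimizer's choice to the listed indexes, while FORCE INDEX additionally makes a table scan look very expensive.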

Thanks

Rinson KE DBA www.qburst.com