Question

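I've been experimenting with query performance for a pagination system, trying to select data as fast as possible, but I've run into something I don't quite understand. As far as I know, when LIMIT is used with an offset, MySQL has to iterate through every row before the offset and then discard them, so in theory a query with an offset of 10,000 should be much slower than one without. That is normally true, as in this case: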

select SQL_NO_CACHE * from `customers` where `NetworkID`='\func uuid()' 
    order by `DateTimeAdded` desc limit 0, 100;
/* finishes in 2.497 seconds */

select SQL_NO_CACHE * from `customers` where `NetworkID`='\func uuid()' 
    order by `DateTimeAdded` desc limit 10000, 100;
/* finishes in 2.702 seconds */

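However, if I use an inner join to join the table to itself, selecting only the UserID column in the subquery to do the sorting and limiting, the query with an offset of 10,000 is consistently faster than the one without, which completely baffles me. Example here: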

select SQL_NO_CACHE * from `customers` 
    inner join (select `UserID` from `customers` where `NetworkID`='\func uuid()' 
        order by `DateTimeAdded` desc limit 100) 
    as `Results` using(`UserID`);
/* finishes in 1.133 seconds */

select SQL_NO_CACHE * from `customers` 
    inner join (select `UserID` from `customers` where `NetworkID`='\func uuid()' 
        order by `DateTimeAdded` desc limit 10000, 100) 
    as `Results` using(`UserID`);
/* finishes in 1.120 seconds */

Why is the query with the offset always faster than the query without it?

Explains:

I have posted a Google Docs spreadsheet with the EXPLAIN output here.

Note: the tests above were run repeatedly in a PHP loop.

Note 2: customers is a view, not a base table.

Answer 1

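Case 1: The optimizer can use an index for the ORDER BY. LIMIT 10 will be faster than LIMIT 10000,10 because it can stop reading rows sooner.

Case 2: The optimizer cannot (or chooses not to) use an index for the ORDER BY. In this case, the entire set of rows (after the WHERE) is collected, that set is sorted, and only then are the OFFSET and LIMIT applied. Here the value of OFFSET makes little difference; most of the time is spent fetching the rows, filtering them, and sorting them.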

INDEX(x,y)
SELECT ... WHERE x=2               ORDER BY y LIMIT ... -- case 1
SELECT ... WHERE x=2 AND deleted=0 ORDER BY y LIMIT ... -- case 2

INDEX(NetworkID, DateTimeAdded)         -- composite
SELECT ... WHERE NetworkID='...' ORDER BY DateTimeAdded DESC ... -- Case 1

INDEX(NetworkID), INDEX(DateTimeAdded)  -- separate
SELECT ... WHERE NetworkID='...' ORDER BY DateTimeAdded DESC ... -- Case 3

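Case 3 might behave like Case 1, because the optimizer might use INDEX(DateTimeAdded). Or the optimizer might choose the other index, in which case it is a slow Case 2. Either way, it is not as good as a composite index that can handle both the WHERE and the ORDER BY.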
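As a rough sketch of how one might move toward Case 1: because customers is a view, the index would have to go on the underlying base table. The table name `customers_base` and the index name `idx_network_added` below are hypothetical, and the sketch assumes the view exposes the base table's `NetworkID` and `DateTimeAdded` columns unchanged.

```sql
-- Assumption: `customers_base` is a hypothetical name for the base table
-- behind the `customers` view; an index cannot be created on a view.
ALTER TABLE `customers_base`
    ADD INDEX `idx_network_added` (`NetworkID`, `DateTimeAdded`);

-- Inspect the plan. If the Extra column no longer shows "Using filesort",
-- the index is serving both the WHERE and the ORDER BY (Case 1).
EXPLAIN
SELECT `UserID`
FROM `customers`
WHERE `NetworkID` = '...'          -- placeholder value
ORDER BY `DateTimeAdded` DESC
LIMIT 100;
```

Whether the optimizer can push the condition and ordering through the view depends on how the view is defined, so the EXPLAIN output is worth checking rather than assuming.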

If you can manage to get into Case 1, I recommend that you also "remember where you left off" to make pagination even more efficient. See my Pagination blog.
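A minimal sketch of what "remember where you left off" could look like for this table, assuming `UserID` is unique so it can break ties between rows with the same `DateTimeAdded`. The user variables `@last_added` and `@last_id` and their example values are hypothetical; in practice the application would carry the last row's values from one page to the next.

```sql
-- First page: no offset at all.
SELECT `UserID`, `DateTimeAdded`
FROM `customers`
WHERE `NetworkID` = '...'          -- placeholder value
ORDER BY `DateTimeAdded` DESC, `UserID` DESC
LIMIT 100;

-- Hypothetical values taken from the last row of the previous page.
SET @last_added = '2015-06-01 12:00:00';
SET @last_id    = 12345;

-- Next page: continue strictly after the last row seen,
-- instead of reading and discarding rows before an OFFSET.
SELECT `UserID`, `DateTimeAdded`
FROM `customers`
WHERE `NetworkID` = '...'
  AND (`DateTimeAdded` < @last_added
       OR (`DateTimeAdded` = @last_added AND `UserID` < @last_id))
ORDER BY `DateTimeAdded` DESC, `UserID` DESC
LIMIT 100;
```

With this shape, the cost of fetching a page should not grow with how deep the page is, provided an index covering NetworkID and DateTimeAdded is available to the optimizer.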

MySQL select with offset is faster than without offset
