How can I efficiently select a random record in MySQL?

Date: 2010-04-25 09:05:07

Tags: mysql sql-optimization

mysql> EXPLAIN SELECT * FROM urls ORDER BY RAND() LIMIT 1;
+----+-------------+-------+------+---------------+------+---------+------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+-------+---------------------------------+
|  1 | SIMPLE      | urls  | ALL  | NULL          | NULL | NULL    | NULL | 62228 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+-------+---------------------------------+

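The query above is inefficient: it scans all 62228 rows and sorts them through a temporary table. What is the right way to do this?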

Update

The solution mentioned in the answers still doesn't seem to help:

mysql> explain SELECT  *
    -> FROM    (
    ->         SELECT  @cnt := COUNT(*) + 1,
    ->                 @lim := 10
    ->         FROM    urls
    ->         ) vars
    -> STRAIGHT_JOIN
    ->         (
    ->         SELECT  r.*,
    ->                 @lim := @lim - 1
    ->         FROM    urls r
    ->         WHERE   (@cnt := @cnt - 1)
    ->                 AND RAND(20090301) < @lim / @cnt
    ->         ) i;
+----+-------------+------------+--------+---------------+------+---------+------+-------+------------------------------+
| id | select_type | table      | type   | possible_keys | key  | key_len | ref  | rows  | Extra                        |
+----+-------------+------------+--------+---------------+------+---------+------+-------+------------------------------+
|  1 | PRIMARY     | <derived2> | system | NULL          | NULL | NULL    | NULL |     1 |                              |
|  1 | PRIMARY     | <derived3> | ALL    | NULL          | NULL | NULL    | NULL |    10 |                              |
|  3 | DERIVED     | r          | ALL    | NULL          | NULL | NULL    | NULL | 62228 | Using where                  |
|  2 | DERIVED     | NULL       | NULL   | NULL          | NULL | NULL    | NULL |  NULL | Select tables optimized away |
+----+-------------+------------+--------+---------------+------+---------+------+-------+------------------------------+

2 answers:

Answer 0 (score: 4)

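Quassnoi wrote a post about selecting rows at random without performing a sort. His example selects 10 rows at random, but you can adapt it to select just one.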
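
To make the idea concrete, here is a minimal Python sketch of what the query in the question's update appears to do: walk the rows once and keep each one with probability (rows still needed) / (rows still unseen). The function name `selection_sample` and the in-memory list are my own illustration, not something from Quassnoi's post; `remaining` and `needed` play the roles of `@cnt` and `@lim`.

```python
import random

def selection_sample(rows, k, rng=random):
    """Pick k rows in a single pass, with no sort."""
    remaining = len(rows)        # rows not yet examined (like @cnt)
    needed = min(k, remaining)   # rows still to pick (like @lim)
    picked = []
    for row in rows:
        # Keep this row with probability needed / remaining.
        if rng.random() < needed / remaining:
            picked.append(row)
            needed -= 1
        remaining -= 1
    return picked

print(selection_sample(list(range(1, 101)), 10))
```

Note that this still reads every row once, which matches the full scan (type ALL) on `urls` in the EXPLAIN output above. What it avoids is the temporary table and the filesort, not the scan.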

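If you want it to be really fast, you can use an approximation that is not perfectly uniform, or one that occasionally fails to return a row.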
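
As a rough illustration of the "sometimes fails to return a row" kind, the sketch below probes a single random id and gives up if that id does not exist. It assumes the table has an integer primary key called `id` (the question never shows the schema), and it uses an in-memory SQLite table as a stand-in for MySQL so that it can be run as-is. The function names are hypothetical.

```python
import random
import sqlite3

def random_row_or_none(conn, rng=random):
    """Probe one random id in [MIN(id), MAX(id)]; None if that id is a gap."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM urls").fetchone()
    if lo is None:  # empty table
        return None
    probe = rng.randint(lo, hi)
    # Primary-key lookup: no scan, no sort.
    return conn.execute("SELECT * FROM urls WHERE id = ?", (probe,)).fetchone()

def random_row(conn, max_tries=100, rng=random):
    """Retry on a miss; each existing row is equally likely to be the first hit."""
    for _ in range(max_tries):
        row = random_row_or_none(conn, rng)
        if row is not None:
            return row
    return None

# Demo table with gaps in the id sequence (assumed schema, SQLite stand-in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT)")
conn.executemany("INSERT INTO urls VALUES (?, ?)",
                 [(i, f"http://example.com/{i}") for i in (1, 2, 5, 9, 10)])
print(random_row(conn))
```

The more gaps there are in the id range, the more probes miss, so this only pays off when most ids in the range still exist.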

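You can also use a prepared statement (which you can wrap in a stored procedure) to quickly select a random row, as shown in Bill Karwin's post: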

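-- FLOOR keeps the offset in 0 .. COUNT(*)-1; ROUND could yield COUNT(*), which returns no row.
SET @r := (SELECT FLOOR(RAND() * (SELECT COUNT(*) FROM mytable)));
SET @sql := CONCAT('SELECT * FROM mytable LIMIT ', @r, ', 1');
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
DEALLOCATE PREPARE stmt1;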

Note that this runs much faster on MyISAM than on InnoDB, because COUNT(*) is expensive in InnoDB but nearly instantaneous in MyISAM.

Answer 1 (score: 0)

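Well, if you can move some of the logic into the application layer (and assuming I haven't misunderstood your question), all you need to do is generate a random ID in the application and then run a simple select for the single record identified by that key. All you need to know is the number of records. Oh, and if that key has been deleted, take the next one.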
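
A minimal sketch of that idea in Python, under some assumptions of mine: the table has an integer primary key named `id`, and I use `MAX(id)` rather than the row count as the upper bound, since after deletions the largest id can exceed the number of rows. An in-memory SQLite table stands in for MySQL so the example runs on its own; `pick_random_url` is a hypothetical name.

```python
import random
import sqlite3

def pick_random_url(conn, rng=random):
    """Draw a random id up to MAX(id) and return the first row at or after it."""
    (max_id,) = conn.execute("SELECT MAX(id) FROM urls").fetchone()
    if max_id is None:  # empty table
        return None
    target = rng.randint(1, max_id)
    # "If that key has been deleted, take the next one."
    return conn.execute(
        "SELECT * FROM urls WHERE id >= ? ORDER BY id LIMIT 1", (target,)
    ).fetchone()

# Demo table where ids 3, 4, 6, 7 and 8 have been deleted (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT)")
conn.executemany("INSERT INTO urls VALUES (?, ?)",
                 [(i, f"http://example.com/{i}") for i in (1, 2, 5, 9, 10)])
print(pick_random_url(conn))
```

One caveat: a row that follows a run of deleted ids is picked more often than its neighbours. In the demo table, id 9 is returned for targets 6 through 9, while id 1 is returned only for target 1. If that bias matters, this is not the right approach.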