Question

I have a query designed to retrieve a random row from a result set. I do not want to use ORDER BY RAND() because it seems rather inefficient.

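My method is as follows:

Generate a single random number in [0, 1).
Give each row of the result set a unique "rank" number, i.e. give the first row the value 1, the second row the value 2, and so on.
Use the random number to get a number between 1 and the number of rows in the result.
Return the row whose rank equals the number derived from the random number.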
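
The arithmetic in the last two steps can be sketched outside SQL. This is a minimal Python illustration, not part of the original query; the function name `rank_from_random` is made up for this example, and it mirrors the expression FLOOR(1 + @rand * @rank) used in the query.

```python
import math


def rank_from_random(r: float, n: int) -> int:
    """Map r in [0, 1) to an integer rank in 1..n, like FLOOR(1 + r * n)."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r must be in [0, 1)")
    if n < 1:
        raise ValueError("n must be at least 1")
    # r * n lies in [0, n), so adding 1 and flooring gives a rank in 1..n.
    return math.floor(1 + r * n)
```

With 10 rows, a random number of 0.0 selects rank 1 and anything from 0.9 upward selects rank 10, so every row can be chosen.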

Example query:

SELECT * FROM (
    (SELECT @rand := RAND(), @rank := 0) r1
    CROSS JOIN
    (SELECT (@rank := @rank + 1) AS num, A.id
     FROM A JOIN B
     ON A.id = B.id
     WHERE B.number = 42
    ) r2
)
WHERE num = FLOOR(1 + @rand * @rank) LIMIT 1

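This works for retrieving one row, but I want 10 random rows. Changing LIMIT 1 to LIMIT 10 does not work, because if num + 10 > number of rows the query does not return 10 rows.

The only solution I can think of is to generate 10 random numbers in the SQL query, check that they are all different from each other, and add several conditions of the form WHERE num = random_number_1. Alternatively, I could call the query 10 times and check that the selected rows are unique. I do not know how to do the former, and the latter seems rather inefficient, unless there is some great caching that would make running the same query repeatedly very fast?

Does anyone have any ideas? Thanks.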

Answer 1

You can try the following:

select sq2.c1 
  from  ( select * 
            from (select @count :=  0) sq0
           cross join  
                 (select t1.c1, @count := @count+1        
                    from t t1       
                    join t t2      
                   using(c1)      
                   where t2.c2 = 42    
                 ) sq1  
         ) sq2   
 -- use a probability to pick random rows
 where if(@count <= 5, 1, floor(1 + rand() * (@count-1))) <= ceiling(log(pow(@count,2)))+1
 limit 5;

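The result will be random unless the result set is smaller than (or the same size as) the limit. If that is a problem, you can wrap the whole thing: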

select sq3.* from ( select ... limit 5 ) sq3
order by rand();

This only shuffles a small number of output rows (at most 5), which is efficient.

Of course, you can always use a temporary table:

create temporary table rset (row_key int auto_increment, key(row_key))
engine=myisam
as ( select .... where c2 = 42 );

set @count := (select count(*) from rset);

-- floor(1 + rand() * @count) yields an integer from 1 to @count
select rset.c1
  from rset
 where row_key in ( floor(1 + rand() * @count),
                    floor(1 + rand() * @count),
                    floor(1 + rand() * @count),
                    floor(1 + rand() * @count),
                    floor(1 + rand() * @count) );

drop table rset;
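
One thing worth checking with the query above: as far as I can tell from the MySQL documentation, RAND() in a WHERE clause is evaluated for every row, so the IN list may not behave like five fixed keys. A possible workaround is to draw the keys once on the client and pass them in as parameters. The sketch below illustrates that idea in Python, using an in-memory SQLite database purely as a stand-in for MySQL; the helper name `sample_rows` and the sample data are assumptions for this example, while `rset`, `row_key` and `c1` follow the answer.

```python
import random
import sqlite3


def sample_rows(conn, k, rng=random):
    """Return c1 for up to k rows of rset, chosen by random row_key."""
    (count,) = conn.execute("SELECT COUNT(*) FROM rset").fetchone()
    if count == 0:
        return []
    # Draw the keys once, before the query runs, so they stay fixed.
    # Duplicates are possible here, so fewer than k rows may come back.
    keys = [rng.randint(1, count) for _ in range(k)]
    placeholders = ",".join("?" for _ in keys)
    sql = f"SELECT c1 FROM rset WHERE row_key IN ({placeholders})"
    return [row[0] for row in conn.execute(sql, keys)]


# Stand-in data: 20 rows with consecutive keys 1..20 (SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rset (row_key INTEGER PRIMARY KEY AUTOINCREMENT, c1 TEXT)"
)
conn.executemany(
    "INSERT INTO rset (c1) VALUES (?)", [(f"row{i}",) for i in range(1, 21)]
)
picked = sample_rows(conn, 5)
```

Because the keys are drawn independently, this returns between one and five rows, which matches the answer's remark that uniqueness needs an extra step.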

If you want to guarantee five unique rows, you can use a second temporary table:

create temporary table row_keys ( row_key int not null primary key );
-- repeat until five inserts have succeeded; on a duplicate-key error, try again
insert into row_keys values (floor(1 + rand() * @count));

select rset.c1
  from rset
  join row_keys
  using(row_key);
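
The insert-and-retry loop described in the comment above can be sketched as follows. This is a minimal Python illustration under the assumption that the keys are drawn on the client; the function name `draw_unique_keys` is hypothetical, and a Python set stands in for the primary key on row_keys.

```python
import random


def draw_unique_keys(count: int, k: int, rng=random) -> set:
    """Draw k distinct integers in 1..count, retrying on duplicates."""
    if k > count:
        raise ValueError("cannot draw more unique keys than there are rows")
    keys = set()
    while len(keys) < k:
        # Adding a value that is already present leaves the set unchanged,
        # which corresponds to a rejected duplicate insert that is retried.
        keys.add(rng.randint(1, count))
    return keys
```

The standard library call `random.sample(range(1, count + 1), k)` gives the same kind of result without an explicit retry loop. Either way the loop is cheap when k is small relative to count, and slows down as k approaches count.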

Retrieving multiple random rows from a MySQL query result set - not using ORDER BY RAND()
