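I have a table with more than 1 million records, and I want to select random rows from it. I don't want to pick from all records, only from the rows that match certain conditions.
Performance is very important, so I can't order by NEWID() and then take the first item.
The table structure is something like this: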
ID BIGINT
Title NVARCHAR(100)
Level INT
Point INT
Right now I have written a query like this:
with
tmp_one as
(
SELECT
R.Id as RID
FROM [User] as U
Inner Join
[Item] as R
On R.UserId = U.Id
WHERE ([R].[Level] BETWEEN @MinLevel AND @MaxLevel)
AND ((ABS((BINARY_CHECKSUM(NEWID(),R.Id,NEWID())))% 10000)/100 ) > @RangeOne
),
tmp_two as
(
Select tmp_one.RID as RID
From tmp_one
Where ((ABS((BINARY_CHECKSUM(NEWID(),RID,NEWID())))% 10000)/100 ) > @RangeTwo
),
tmp_three as
(
Select RID as RID
From tmp_two
Where ((ABS((BINARY_CHECKSUM(NEWID(),NEWID())))% 10000)/100 ) < @RangeThree
)
Select top 10 RID
From tmp_three
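I'm trying to select 10 items at random and then pick one of them, but I've run into a surprising problem!
Sometimes the output comes back ordered by the item's Level, and I don't want that (it isn't really random). I really don't understand how the result ends up ordered by Level.
Please suggest a solution that lets me select random records with good performance and without repeats over a large number of iterations.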
Answer 0 (score: 1)
The MSDN article "Selecting Rows Randomly from a Large Table" covers this. Instead of the query you are avoiding:
select top 10 * from TableName order by newid()
it suggests this:
select top 10 * from TableName where (abs(cast((binary_checksum(*) * rand()) as int)) % 100) < 10
It needs far fewer logical reads, so it performs much better.
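As a rough sketch of how that idea could be combined with the Level condition from the question: filter first, keep a small random fraction of the matching rows with a per-row checksum, then shuffle only that small sample. The variable values and the 1 percent sample rate are assumptions for illustration, not something from the question, and NEWID() is placed inside the checksum so that each row gets its own random value.

```sql
-- Sketch only: the @MinLevel/@MaxLevel values and the 1 percent rate are assumptions.
DECLARE @MinLevel int = 1, @MaxLevel int = 5;

SELECT TOP (10) S.Id
FROM (
    -- Cheap per-row filter: keep roughly 1% of the rows that match the condition.
    -- The cast to bigint keeps ABS from overflowing on the smallest int value.
    SELECT R.Id
    FROM [Item] AS R
    WHERE R.[Level] BETWEEN @MinLevel AND @MaxLevel
      AND ABS(CAST(BINARY_CHECKSUM(R.Id, NEWID()) AS bigint)) % 100 < 1
) AS S
-- Shuffle only the small sample, so the sort is cheap and the
-- output order is explicitly random rather than left to the plan.
ORDER BY NEWID();
```

Without an ORDER BY, SQL Server makes no promise about row order, which is probably why the original output sometimes appears sorted by Level. If the sample can contain fewer than 10 rows, the rate would need to be raised.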
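Answer 1 (score: -1)
Try something like this. It will grab 10 random rows from your table.
This is pseudocode, so you may need to fix a few column names to match your real table.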
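DECLARE @Random int
DECLARE @TotalRows int
DECLARE @Result table
(ID BIGINT,
Title nvarchar(100),
Level int,
Point int)
-- Count the candidate rows once, before the loop.
set @TotalRows = (select COUNT(*) From [User] U inner join [Item] R on R.UserID = U.ID)
-- Stop at 10 rows, or earlier if the join has fewer than 10 rows in total.
while (select COUNT(*) from @Result) < case when @TotalRows < 10 then @TotalRows else 10 end
begin
-- Random position between 1 and @TotalRows.
set @Random = floor(RAND() * @TotalRows) + 1
insert into @Result
select T1.ID, T1.Title, T1.Level, T1.Point from
-- T1 = first @Random rows, T2 = first @Random - 1 rows; ORDER BY makes TOP deterministic.
(select top (@Random) R.ID, R.Title, R.Level, R.Point From [User] U inner join [Item] R on R.UserID = U.ID order by R.ID) T1
left outer join (select top (@Random - 1) R.ID From [User] U inner join [Item] R on R.UserID = U.ID order by R.ID) T2 on T2.ID = T1.ID
where T2.ID is null
-- Skip a row that was already picked in an earlier iteration.
and not exists (select 1 from @Result X where X.ID = T1.ID)
end
select * from @Result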
Here is how it works.
Select a random number. For example 47.
We want to select the 47th row of the table.
Select the top 47 rows, call it T1.
Join it to the top 46 rows called T2.
The row where T2 is null is the 47th row.
Insert that into a temporary table.
Do it until there are 10 rows.
Done.
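The "top N minus top N-1" steps above are one way to isolate the N-th row. As a hedged alternative sketch (a swapped-in technique, not this answer's own code), ROW_NUMBER() can express the same pick directly. Ordering by R.Id and reading only from [Item] are assumptions made for illustration.

```sql
-- Sketch only: picks one random row by position using ROW_NUMBER().
DECLARE @TotalRows int, @Random int;
SET @TotalRows = (SELECT COUNT(*) FROM [Item]);
-- Random position between 1 and @TotalRows.
SET @Random = FLOOR(RAND() * @TotalRows) + 1;

SELECT N.Id, N.Title, N.[Level], N.Point
FROM (
    -- Number every row in a fixed order so position @Random is well defined.
    SELECT R.Id, R.Title, R.[Level], R.Point,
           ROW_NUMBER() OVER (ORDER BY R.Id) AS RowNum
    FROM [Item] AS R
) AS N
WHERE N.RowNum = @Random;
```

Either form still has to walk the rows up to position N, so on a table with over a million rows it may not meet the performance goal stated in the question.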