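I need to generate a systematic random sample of (about) 2,500 rows from (about) 230,000 rows, each of which has a unique, auto-generated number.
Is this possible using Teradata SQL Assistant? (The SAMPLE function generates a simple random sample.)
Thanks for your time.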
Answer 0 (score: 1)
select rank() over(order by $primary_index_key), t1.*
FROM
(select * from $table_name
sample 2500) t1
SQL Assistant will run this, as will any other client. The same approach can be used to generate winning Powerball numbers.
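To make clear what the query above returns, here is a minimal Python sketch of the same idea: draw a simple random sample, then number the chosen rows in key order. The function name and the key range 1..230,000 are assumptions for illustration only. The gaps between chosen keys vary, so this is a simple random sample rather than a systematic one.

```python
import random

def simple_random_sample_with_rank(keys, n, rng=random):
    """Pick n keys without replacement, then rank them in key order.

    Mirrors SAMPLE n followed by RANK() OVER (ORDER BY key).
    `keys` must be a sequence (a list or a range).
    """
    chosen = rng.sample(keys, n)
    # enumerate(..., start=1) plays the role of RANK() on unique keys
    return [(rank, key) for rank, key in enumerate(sorted(chosen), start=1)]

# Hypothetical table: keys 1..230000, sample size 2500
ranked = simple_random_sample_with_rank(range(1, 230_001), 2_500)
print(ranked[:3])
```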
Answer 1 (score: 1)
When there is already a gap-free unique row number:
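select t.*
from mytable as t
cross join
 ( select random(0,2499) as rnd ) as dt -- random start offset; 0..2499 matches the possible results of "mod 2500"
where rownumber mod 2500 = rnd -- one row out of every 2500
-- note: to get about 2,500 rows out of 230,000, the interval would be 92 (230000 / 2500), i.e. random(0,91) and "mod 92"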
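The modulo logic can be checked outside the database. The following is a minimal Python sketch, assuming row numbers 1..230,000 without gaps and a target of 2,500 rows; the function name is hypothetical. It computes the interval as 230,000 / 2,500 = 92 and draws the start offset from 0..91 so that it always matches a possible remainder.

```python
import random

def systematic_sample(row_numbers, interval, rng=random):
    """Keep one row per block of `interval` rows, at a random offset."""
    start = rng.randrange(interval)  # 0 .. interval-1, same range as n % interval
    return [n for n in row_numbers if n % interval == start]

total_rows = 230_000   # assumed table size
target = 2_500         # assumed sample size
interval = total_rows // target  # 92

sample = systematic_sample(range(1, total_rows + 1), interval)
print(interval, len(sample))
```

Because 230,000 is an exact multiple of 92, every start offset yields the same number of rows; with other table sizes the count can differ by one depending on the offset.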
If the row numbers have gaps, you can create a gap-free numbering with ROW_NUMBER:
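select t.*
from mytable as t
cross join
 ( select random(0,2499) as rnd ) as dt -- random start offset; 0..2499 matches the possible results of "mod 2500"
qualify ROW_NUMBER() OVER (ORDER BY whatever_determines_your_order) mod 2500 = rnd -- one row out of every 2500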
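A small Python sketch shows why the numbering has to be rebuilt when the ids have gaps: the modulo test is applied to the position in the ordered list, not to the id itself. The function name and the sample ids are assumptions for illustration.

```python
import random

def systematic_sample_by_rank(keys, interval, rng=random):
    """Systematic sample over keys that may have gaps."""
    start = rng.randrange(interval)  # 0 .. interval-1
    ordered = sorted(keys)
    # enumerate(..., start=1) stands in for ROW_NUMBER() OVER (ORDER BY key)
    return [k for rn, k in enumerate(ordered, start=1) if rn % interval == start]

# Hypothetical ids with gaps: only multiples of 3 exist (1000 ids)
ids = [i for i in range(1, 3001) if i % 3 == 0]
picked = systematic_sample_by_rank(ids, interval=10)
print(len(picked), picked[:3])
```

Applying the modulo to the raw ids here would either select nothing or select unevenly, depending on the offset, which is the problem the ROW_NUMBER version avoids.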
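Answer 2 (score: 0)
From your comments, I don't think you can do this dynamically. You first need to create a table containing a step number and a step_value, for example:
[1, 2500], [2, 5000], ... [X, X * 2500]
Then join this table to your query and restrict the row numbers with the logic rn = random_seed + step_value. It would look like this: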
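select t1.*
from (
 select t.*
 , row_number() over (order by ordering_column) as rn -- ordering_column is a placeholder for your ordering logic
 from my_table as t
 ) as t1
 , (select random(1,2500) as random_seed) as seed -- derived table, so the seed is generated only once
where exists (
 select 1 from sampling_table as t2
 where t1.rn = t2.step_value + seed.random_seed
 )
-- note: with step values starting at 2500, rows 1..2500 are never selected; add a [0, 0] row to sampling_table to include them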
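The step-table logic can be sketched in Python as follows. This is an illustration under assumptions, not the exact setup above: the function names are hypothetical, the interval is 92 (230,000 / 2,500) rather than 2,500, and the step values start at 0 so that the first block of rows is also eligible.

```python
import random

def build_step_values(total_rows, interval):
    """Contents of the step table: 0, interval, 2 * interval, ..."""
    return list(range(0, total_rows, interval))

def sample_with_steps(total_rows, interval, rng=random):
    seed = rng.randint(1, interval)  # generated once, like the seed derived table
    steps = set(build_step_values(total_rows, interval))
    # keep row number rn when some step satisfies rn = step_value + seed
    return [rn for rn in range(1, total_rows + 1) if (rn - seed) in steps]

rows = sample_with_steps(total_rows=230_000, interval=92)
print(len(rows), rows[:3])
```

The result is the same set of rows that the modulo test would produce; the step table only replaces the arithmetic with a lookup.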