I have a table in Redshift. Here is a sample of its records for ID 71082:
id trm_num start_time
71082 PCMAMGA759551 2012-05-02 09:41:54
71082 PCMAMGA759551 2015-06-02 13:23:39
71082 PCMAMGA759551 2015-09-03 13:23:39
71082 PCMAMGA759551 2015-12-11 07:25:25
71082 PCMAMGA759551 2017-01-10 09:03:22
I want to select exactly one random record for each id. To do that, I tried this query:
select * from mytable where id=71082 order by random() limit 1;
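It returns a random record for that ID. However, the table has 1000 distinct IDs. How can I modify my query so it works for the other IDs as well?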
Answer 0 (score: 3)
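Use the window function ROW_NUMBER with a random order within each ID. The inner query numbers the rows of each id in random order, and the outer query keeps only the row numbered 1: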
select id, trm_num, start_time
from
(
select
id, trm_num, start_time,
row_number() over (partition by id order by random()) as rn
from mytable
) numbered
where rn = 1;
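
To sanity-check the query without touching the cluster, it can be run locally against the sample rows. The sketch below is only an approximation: it assumes SQLite 3.25 or later (which supports window functions) as a stand-in for Redshift, and it adds a second id, 71083, with made-up rows purely to show that each id gets its own pick. Those extra rows are hypothetical and are not part of the question's data.

```python
import sqlite3

# Sample rows from the question for id 71082.
# The 71083 rows are HYPOTHETICAL, added only so there is more than one id.
rows = [
    (71082, "PCMAMGA759551", "2012-05-02 09:41:54"),
    (71082, "PCMAMGA759551", "2015-06-02 13:23:39"),
    (71082, "PCMAMGA759551", "2015-09-03 13:23:39"),
    (71082, "PCMAMGA759551", "2015-12-11 07:25:25"),
    (71082, "PCMAMGA759551", "2017-01-10 09:03:22"),
    (71083, "HYPOTHETICAL01", "2016-01-01 00:00:00"),
    (71083, "HYPOTHETICAL01", "2016-02-01 00:00:00"),
]

# In-memory SQLite database standing in for the Redshift table.
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id integer, trm_num text, start_time text)")
conn.executemany("insert into mytable values (?, ?, ?)", rows)

# Same query as in the answer: number each id's rows in random order,
# then keep the first row of every id.
query = """
select id, trm_num, start_time
from (
    select id, trm_num, start_time,
           row_number() over (partition by id order by random()) as rn
    from mytable
) numbered
where rn = 1
"""

picked = conn.execute(query).fetchall()
for row in picked:
    print(row)
```

Each run should print one row per id, and the chosen row can differ between runs because the ordering inside each partition is random.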