Redshift: select random records but avoid duplicates

Time: 2018-04-05 10:26:45

Tags: sql amazon-web-services amazon-redshift

I have a table in Redshift. Here is a sample of its records for ID 71082:

id      trm_num         start_time
71082   PCMAMGA759551   2012-05-02 09:41:54
71082   PCMAMGA759551   2015-06-02 13:23:39
71082   PCMAMGA759551   2015-09-03 13:23:39
71082   PCMAMGA759551   2015-12-11 07:25:25
71082   PCMAMGA759551   2017-01-10 09:03:22

I want to select only one random record for each id. To do that, I tried this query:

select * from mytable where id=71082 order by random() limit 1;

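It returns one random record for that id. But the table has 1000 distinct IDs. How can I modify my query so that it does the same for the other IDs?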

1 answer:

Answer 0: (score: 3)

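Use the window function ROW_NUMBER with a random order per ID. Partitioning by id restarts the numbering for each ID, so keeping only rn = 1 leaves exactly one randomly chosen row per ID: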

select id, trm_num, start_time
from
(
  select
    id, trm_num, start_time,
    row_number() over (partition by id order by random()) as rn
  from mytable
) numbered
where rn = 1;
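
To try the query above without touching the real table, here is a minimal sketch that builds a small stand-in for mytable in a CTE. The rows for id 71082 come from the question; the second id (99999) and its values are made up purely for illustration.

```sql
-- Stand-in data: id 71082 rows are from the question,
-- id 99999 and its values are hypothetical.
with mytable as (
  select 71082 as id, 'PCMAMGA759551' as trm_num,
         timestamp '2012-05-02 09:41:54' as start_time
  union all select 71082, 'PCMAMGA759551', timestamp '2015-06-02 13:23:39'
  union all select 71082, 'PCMAMGA759551', timestamp '2017-01-10 09:03:22'
  union all select 99999, 'HYPOTHETICAL01', timestamp '2016-03-04 08:00:00'
  union all select 99999, 'HYPOTHETICAL01', timestamp '2018-01-15 12:30:00'
),
numbered as (
  -- Number the rows of each id in a random order.
  select id, trm_num, start_time,
         row_number() over (partition by id order by random()) as rn
  from mytable
)
-- Keep the first row of each partition: one random row per id.
select id, trm_num, start_time
from numbered
where rn = 1
order by id;
```

This should return two rows, one for each id. The start_time picked for each id can change from run to run, because random() is re-evaluated on every execution.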