Question

Suppose I have a table:

HH  SLOT  RN
--------------
 1     1  null
 1     2  null
 1     3  null
--------------
 2     1  null
 2     2  null
 2     3  null

I want to set RN to a random number between 1 and 10. The number may repeat across the table as a whole, but it must NOT repeat within any given HH. E.g.:

HH  SLOT  RN_GOOD  RN_BAD
--------------------------
 1     1        9       3
 1     2        4       8
 1     3        7       3  <--!!!
--------------------------
 2     1        2       1
 2     2        4       6
 2     3        9       4

If it makes any difference, this is on Netezza. This one is a real head-scratcher for me. Thanks in advance!

Answer 1

To get a random number between 1 and the number of rows in each hh, you can use:

select hh, slot, row_number() over (partition by hh order by random()) as rn
from t;

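A larger range of values is more challenging. The following computes a table (called randoms) that pairs each number with a random position in the same range. It then uses slot as an index into those positions to pull the random number from randoms: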

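with nums as (
      -- candidate values 1..10, as asked for in the question
      select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
      select 6 union all select 7 union all select 8 union all select 9 union all select 10
     ),
     randoms as (
      -- shuffle the values separately for each hh, so households do not all share one ordering
      select h.hh, nums.n,
             row_number() over (partition by h.hh order by random()) as pos
      from (select distinct hh
            from t
           ) h cross join
           nums
     )
-- slot k of a household takes the value that landed at position k of that household's shuffle
select t.hh, t.slot, r.n as rn
from t join
     randoms r
     on t.hh = r.hh and
        t.slot = r.pos;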

Here is a SQLFiddle that demonstrates this in Postgres, which I think is close enough to Netezza to have matching syntax.
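For readers who want to check the idea outside the database, here is a minimal Python sketch of the same principle: draw values without replacement separately for each hh. The function name assign_unique_per_group and its parameters are hypothetical and not part of the answer, and it assumes no hh has more rows than there are values in the range.

```python
import random
from collections import defaultdict

def assign_unique_per_group(rows, low=1, high=10, rng=random):
    """rows: iterable of (hh, slot) pairs. Returns {(hh, slot): rn}.

    Hypothetical helper for illustration only. Assumes each hh has at most
    (high - low + 1) rows; otherwise sample() raises ValueError.
    """
    slots_by_hh = defaultdict(list)
    for hh, slot in rows:
        slots_by_hh[hh].append(slot)

    result = {}
    for hh, slots in slots_by_hh.items():
        # sample() draws without replacement, so values cannot repeat inside
        # one hh; different hh values are drawn independently and may overlap.
        values = rng.sample(range(low, high + 1), len(slots))
        for slot, rn in zip(slots, values):
            result[(hh, slot)] = rn
    return result

rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
assignment = assign_unique_per_group(rows)
```

Drawing without replacement is equivalent to shuffling the whole range and keeping the first few positions, which is what the slot = pos join does in the query.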

Answer 2

I'm not an expert in SQL, but you could probably do something like this:

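1. Initialize a counter CNT = 1.
2. Create a table that randomly samples 1 row with a null RN from each group, along with the group's count of null RNs, say C_NULL_RN.
3. For each sampled row, with probability C_NULL_RN / (10 - CNT + 1), assign CNT as its RN.
4. Increment CNT and go to step 2.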
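The steps above can be sketched in Python to see why they work. This is only an illustration under stated assumptions: the function name fill_by_selection_sampling and its parameters are hypothetical, the loop stops once CNT passes the top of the range, and no group has more rows than there are values.

```python
import random

def fill_by_selection_sampling(rows, high=10, rng=random):
    """rows: list of (hh, slot) pairs. Returns {(hh, slot): rn}.

    Hypothetical helper for illustration only. Assumes each hh has at most
    `high` rows, so every row ends up with a value.
    """
    rn = {row: None for row in rows}
    for cnt in range(1, high + 1):            # steps 1 and 4: CNT = 1, 2, ..., high
        empty_by_hh = {}
        for row, value in rn.items():
            if value is None:
                empty_by_hh.setdefault(row[0], []).append(row)
        for hh, empty in empty_by_hh.items():
            c_null_rn = len(empty)            # step 2: number of null RNs in this hh
            candidate = rng.choice(empty)     # step 2: one random null row per hh
            # Step 3: accept CNT with probability C_NULL_RN / (high - CNT + 1).
            # When the null rows equal the values left, the ratio is 1 and the
            # value is always taken, so no row is left empty at the end.
            if rng.random() < c_null_rn / (high - cnt + 1):
                rn[candidate] = cnt
    return rn

rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
assignment = fill_by_selection_sampling(rows)
```

Each value of CNT is offered to at most one row per group, which is why a number can never repeat inside a group.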

Answer 3

Well, I couldn't come up with a slick solution, so I did a hack:

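1. Created a new integer field called rand_inst.
2. Assigned a random number to each empty slot.
3. Updated rand_inst to be the instance number of that random number within the household. E.g., if I get two 3's, the second 3 has rand_inst set to 2.
4. Updated the table to assign a new random number wherever rand_inst>1.
5. Repeated the assignment and update until it converges on a solution.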

Here's what it looks like. I was too lazy to anonymize it, so the names are a little different from my original post:

/* Iterative hack to fill 6 slots with a random number between 1 and 13.
   A random number *must not* repeat within a household_id.
*/
update c3_lalfinal a
set a.rand_inst = b.rnum
from (
    select household_id
          ,slot_nbr
          ,row_number() over (partition by household_id,rnd order by null) as rnum
    from c3_lalfinal
) b
where a.household_id = b.household_id
  and a.slot_nbr = b.slot_nbr
;

update c3_lalfinal
set rnd = CAST(0.5 + random() * (13-1+1) as INT)
where rand_inst>1
;

/* Repeat until this query returns 0: */
select count(*) from (
  select household_id from c3_lalfinal group by 1 having count(distinct(rnd)) <> 6
) x
;
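The same iterate-until-clean loop can be sketched in Python, which may make the stopping condition easier to see. This is a rough illustration, not the code that was run: the function name fill_by_redraw and the max_rounds safety cap are hypothetical additions, and it assumes no household has more slots than there are values in the range.

```python
import random
from collections import defaultdict

def fill_by_redraw(rows, low=1, high=13, rng=random, max_rounds=10_000):
    """rows: list of (household_id, slot_nbr) pairs. Returns {row: rnd}.

    Hypothetical helper for illustration only; max_rounds is an added guard
    so the loop cannot run forever on impossible input.
    """
    rnd = {row: rng.randint(low, high) for row in rows}   # initial random draw
    for _ in range(max_rounds):
        seen = defaultdict(set)
        repeats = []                      # rows that would get rand_inst > 1
        for row in rows:
            household = row[0]
            if rnd[row] in seen[household]:
                repeats.append(row)       # second or later copy of a value
            else:
                seen[household].add(rnd[row])
        if not repeats:                   # every household has distinct values
            return rnd
        for row in repeats:               # re-draw only the repeated rows
            rnd[row] = rng.randint(low, high)
    raise RuntimeError("no duplicate-free assignment found within max_rounds")

rows = [(hh, slot) for hh in (1, 2) for slot in range(1, 7)]
assignment = fill_by_redraw(rows)
```

Only the repeated rows are re-drawn, so the first occurrence of each value in a household keeps its number from one round to the next.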

SQL random number that does not repeat within a group
