Question

I am trying to pick k random numbers in the range 1..n such that no two of the k numbers are contiguous. The code I came up with is:

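def noncontiguoussample(n,k):
    import random
    # Candidates 1..n; a real list is needed so that remove() works in Python 3
    numbers = list(range(1, n+1))
    samples = []
    for _ in range(k):
        v = random.choice(numbers)
        samples.append(v)
        # Drop the chosen value and both of its neighbours from the candidates
        for neighbour in range(v-1, v+2):
            try:
                numbers.remove(neighbour)
            except ValueError:
                pass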

    return samples

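Update: I know this function does not return samples with uniform probability. Based on my limited testing, Amber's solution below satisfies both conditions: (a) the individual elements of the sample are non-contiguous, and (b) all possible samples of size k (from 1 ... n) are generated with uniform probability.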

Answer 1

The code is simpler if you use a set.

import random

def noncontiguoussample(n,k):
    numbers = set(range(1,n+1))
    samples = []
    for _ in range(k):
        v = random.choice(list(numbers))
        samples.append(v)
        numbers -= set([v-1, v, v+1])
    return samples

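However, as Michael Anderson pointed out in the comments, this algorithm can sometimes fail when n < 3*k: it may run out of candidates before k numbers have been picked.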
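To make the failure case concrete, here is a small sketch that walks every possible sequence of picks of the remove-the-neighbours approach and adds up exact probabilities. The helper names `greedy_outcomes` and `walk` are made up for this illustration and are not part of the original answer.

```python
from collections import defaultdict
from fractions import Fraction


def greedy_outcomes(n, k):
    """Exact outcome distribution of the pick-and-remove-neighbours approach.

    Returns a dict mapping each sorted sample (as a tuple) to its probability.
    The key None collects the runs that get stuck with no candidates left.
    """
    outcomes = defaultdict(Fraction)

    def walk(remaining, picked, prob):
        if len(picked) == k:
            outcomes[tuple(sorted(picked))] += prob
            return
        if not remaining:
            # Fewer than k picks made, but nothing is left to choose from
            outcomes[None] += prob
            return
        share = prob / len(remaining)  # each candidate is equally likely
        for v in remaining:
            walk(remaining - {v - 1, v, v + 1}, picked + [v], share)

    walk(frozenset(range(1, n + 1)), [], Fraction(1))
    return dict(outcomes)


print(greedy_outcomes(5, 3)[None])  # 8/15
print(greedy_outcomes(4, 2))
```

For n=5, k=3 the only valid sample is (1, 3, 5), yet the approach gets stuck in 8 out of 15 runs on average. For n=4, k=2 it never gets stuck, but the three valid samples come out with probabilities 3/8, 1/4 and 3/8, so the result is not uniform either.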

A better algorithm that cannot fail (and is faster, too!) might look like this:

import random

def noncontiguoussample(n,k):
    # How many numbers we're not picking
    total_skips = n - k

    # Distribute the additional skips across the range
    skip_cutoffs = random.sample(range(total_skips+1), k)
    skip_cutoffs.sort()

    # Construct the final set of numbers based on our skip distribution
    samples = []
    for index, skip_spot in enumerate(skip_cutoffs):
        # This is just some math-fu that translates indices within the
        # skips to values in the overall result.
        samples.append(1 + index + skip_spot)

    return samples

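The math at the end works like this:

1, the smallest value we can pick
plus 1 for each number we have already picked (index), to account for the picked numbers themselves
plus our position within the skips (which always increases by at least one)

So on each iteration of the loop the result always increases by at least 2.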
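One way to see why this also gives uniform samples: the mapping from sorted skip positions to results is one-to-one, and it hits every valid sample. The sketch below checks that exhaustively for one small case. The helpers `cutoffs_to_sample` and `is_noncontiguous` are names invented for this check, not part of the answer's code.

```python
from itertools import combinations


def cutoffs_to_sample(cutoffs):
    """Apply the 1 + index + skip_spot mapping to a set of skip positions."""
    return [1 + index + skip_spot
            for index, skip_spot in enumerate(sorted(cutoffs))]


def is_noncontiguous(values):
    """True if no two neighbouring values of a sorted sequence differ by 1."""
    return all(b - a >= 2 for a, b in zip(values, values[1:]))


n, k = 7, 3

# Every way random.sample could choose k distinct skip positions (order ignored)
mapped = [tuple(cutoffs_to_sample(c))
          for c in combinations(range(n - k + 1), k)]

# Every valid answer, found by brute force over all k-subsets of 1..n
expected = {c for c in combinations(range(1, n + 1), k)
            if is_noncontiguous(c)}

assert len(mapped) == len(set(mapped))  # no two choices collide
assert set(mapped) == expected          # every valid sample is reachable

print(cutoffs_to_sample([0, 2, 4]))  # [1, 4, 7]
```

Since random.sample picks every k-subset of the skip positions with equal probability, each valid sample is equally likely. Two side effects are worth keeping in mind: the result always comes back sorted, and random.sample raises ValueError when k is larger than n - k + 1, which is exactly the case where no valid sample exists.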

Answer 2

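Here is an unbiased version that does not fail (though it is slower than Amber's solution). If you give it a case that has no solution, which happens when n < 2*k - 1, it will loop forever (but that can be fixed).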

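import random

# Check that no two values in the sample are adjacent
def is_valid_choice( s ):
  chosen = set(s)  # set membership keeps each lookup cheap
  for x in chosen:
    if x-1 in chosen or x+1 in chosen:
      return False
  return True

# The real function: draw plain samples until one passes the check
def noncontiguoussample(n,k):
  while True:
    s = random.sample(range(1,n+1),k)
    if is_valid_choice(s):
      return s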
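The endless loop on impossible inputs could be avoided with an up-front check. The following is only a sketch of one way to do it. The function name `noncontiguous_sample_checked` is made up here, and the check relies on the observation that the tightest valid sample is 1, 3, 5, ..., 2*k - 1, so a solution exists only when n >= 2*k - 1.

```python
import random


def noncontiguous_sample_checked(n, k):
    """Rejection sampling with a feasibility check before the loop."""
    if k < 0 or 2 * k - 1 > n:
        raise ValueError("no non-contiguous sample of size %d in 1..%d" % (k, n))
    while True:
        s = random.sample(range(1, n + 1), k)
        chosen = set(s)
        # Reject the draw if any value has its successor in the sample
        if not any(x + 1 in chosen for x in chosen):
            return s


print(noncontiguous_sample_checked(10, 4))
```

Each accepted draw is still a plain uniform sample that happened to be valid, so the result stays unbiased. The loop can still take many attempts when n is close to 2*k - 1, because few draws pass the check.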
