Question

I have an array that contains a set of numbers n times. Example for n=2:

[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

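What I want is a partition of this array in which each member of the partition

contains elements drawn at random from the array,
contains no duplicates, and
contains the same number of elements k (up to rounding).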

Example output for k=4:

[[3,0,2,1], [0,1,4,2], [3,4]]

Invalid output for k=4:

[[3,0,2,2], [3,1,4,0], [1,4]]

(This is a partition, but its first member contains a duplicate.)
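
The three requirements can be written down as a small checker, which makes the difference between the valid and the invalid output explicit. This is only a sketch: the name is_valid_partition is hypothetical, and "the same number of elements (up to rounding)" is interpreted here as "no group is larger than k", which is an assumption. It checks the structure of a result, not how random it is.

```python
from collections import Counter

def is_valid_partition(seq, groups, k):
    # Hypothetical helper: checks structure only, not randomness.
    # 1. Every element of seq is used exactly once across all groups.
    same_elements = Counter(x for g in groups for x in g) == Counter(seq)
    # 2. No group contains the same value twice.
    no_duplicates = all(len(set(g)) == len(g) for g in groups)
    # 3. Assumption: "same size up to rounding" means at most k per group.
    sizes_ok = all(len(g) <= k for g in groups)
    return same_elements and no_duplicates and sizes_ok

arr = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
print(is_valid_partition(arr, [[3, 0, 2, 1], [0, 1, 4, 2], [3, 4]], 4))  # True
print(is_valid_partition(arr, [[3, 0, 2, 2], [3, 1, 4, 0], [1, 4]], 4))  # False
```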

What is the best way to achieve this?

Answer 1

You can use a combination of collections.Counter and random.sample:

from collections import Counter
import random

def random_partition(seq, k):
    cnts = Counter(seq)
    # as long as there are enough items to "sample" take a random sample
    while len(cnts) >= k:
        sample = random.sample(list(cnts), k)
        cnts -= Counter(sample)
        yield sample

    # Fewer different items than the sample size, just return the unique
    # items until the Counter is empty
    while cnts:
        sample = list(cnts)
        cnts -= Counter(sample)
        yield sample

This is a generator that yields the samples, so you can simply convert it to a list:

>>> l = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

>>> list(random_partition(l, 4))
[[1, 0, 2, 4], [1, 0, 2, 3], [3, 4]]

>>> list(random_partition(l, 2))
[[1, 0], [3, 0], [1, 4], [2, 3], [4, 2]]

>>> list(random_partition(l, 6))
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

>>> list(random_partition(l, 4))
[[4, 1, 0, 3], [1, 3, 4, 0], [2], [2]]

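The last case shows that this approach can give odd results if the "random" part of the function returns the "wrong" samples. Here neither of the first two samples contained a 2, so both 2s were left over, and because a group may not contain duplicates they ended up in two separate single-element groups. If that should not happen, or at least not often, you need to work out how to weight the samples (for example using random.choices) to minimize that probability.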
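
One possible way to avoid such leftovers is sketched below. Note that it swaps in a different technique from the weighting idea: instead of weighted sampling, it always takes the k values with the highest remaining counts and uses randomness only to break ties and to order each group. The name balanced_random_partition is hypothetical, and the result is less random than with plain random.sample, so treat it as a starting point rather than a definitive fix.

```python
from collections import Counter
import random

def balanced_random_partition(seq, k):
    # Hypothetical variant: prefer values with the most copies left,
    # so no value is left over to fill several tiny groups at the end.
    cnts = Counter(seq)
    while cnts:
        items = list(cnts)
        random.shuffle(items)  # random order among equally frequent values
        # sort() is stable, so ties keep their shuffled order
        items.sort(key=cnts.__getitem__, reverse=True)
        sample = items[:k]     # fewer than k if not enough distinct values remain
        random.shuffle(sample)
        cnts -= Counter(sample)  # in-place subtraction drops values that reach zero
        yield sample

arr = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
print(list(balanced_random_partition(arr, 4)))
```

For the example list with k=4, the group sizes are always 4, 4 and 2: after the first sample exactly one value still has two copies left, and the sort forces it into the second sample.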
