Question

Given a list of strings like this (in reality I have a much longer list, but I'll keep it short here):

items=['fish','headphones','wineglass','bowtie','cheese','hammer','socks']

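I want to randomly pick a subset of this list, e.g. 3 items, such that each item can only be picked once. That is easy enough using the following: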

import itertools
import random
def random_combination(iterable, r):
    "Random selection from itertools.combinations(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(xrange(n), r))
    return tuple(pool[i] for i in indices)

items=['fish','headphones','wineglass','bowtie','cheese','hammer','socks']
randomPick=random_combination(items,3)

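Next, to make things painful, I don't want to do this just once but several times, say 10. The final product would be 10 lists of items, each item picked at most once per list, with the constraint that across those 10 lists every item appears an equal number of times. I'd like to avoid "socks" being picked 10 times and "hammer" only once, for example.

This is the step I'm stuck on. I simply don't know enough programming, or enough about the functions available in Python, to do something like this.

Can anyone help?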

Answer 1

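The following code may help. It pops random elements from (a copy of) the iterable until the copy is empty, then starts over with the whole list. The downside is that every item gets picked once before any item can be picked a second time. However, as you can see from the output, the distribution of items ends up being approximately equal.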

import random

def equal_distribution_combinations(iterable, n, csize):
    """
    Yield 'n' lists of size 'csize' containing distinct random elements
    from 'iterable.' Elements of 'iterable' are approximately evenly
    distributed across all yielded combinations.
    """
    i_copy = list(iterable)

    if csize > len(i_copy):
        raise ValueError(
            "csize cannot exceed len(iterable), as elements could not be distinct."
        )

    for i in range(n):
        comb = []
        for j in range(csize):
            if not i_copy:
                i_copy = list(iterable)

            randi = random.randint(0, len(i_copy) - 1)

            # If i_copy was refilled partway through this combination, comb
            # could end up with duplicate elements without this check.
            while i_copy[randi] in comb:
                randi = random.randint(0, len(i_copy) - 1)
            comb.append(i_copy.pop(randi))

        yield comb

Edit

Apologies for the Python 3. The only change to the function for Python 2 should be range -> xrange.

Edit 2 (answering a question from the comments)

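equal_distribution_combinations will produce an even distribution for any n, csize and length of iterable, as long as csize does not exceed len(iterable) (since the elements of a combination could not be distinct otherwise).

Here is a test using the specific numbers from the comment:

items = range(30)
item_counts = {k: 0 for k in items}

for comb in equal_distribution_combinations(items, 10, 10):
    print(comb)
    for e in comb:
        item_counts[e] += 1

print('')
for k, v in item_counts.items():
    print('Item: {0}  Count: {1}'.format(k, v))

As the printed counts show, the items are evenly distributed: with 100 picks spread over 30 items, every item is picked either 3 or 4 times.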
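The tally in the test can also be wrapped into a small reusable check. The following is only a sketch: `pick_counts` and `is_balanced` are hypothetical helper names, not part of the answer's code, and "balanced" is assumed here to mean that the most and least picked items differ by at most one pick. It also assumes `items` is non-empty.

```python
from collections import Counter


def pick_counts(items, combos):
    """Count how often each item appears across all combos (0 if never picked)."""
    counts = Counter({item: 0 for item in items})  # keep never-picked items at 0
    for comb in combos:
        counts.update(comb)
    return counts


def is_balanced(items, combos):
    """True if the most and least picked items differ by at most one pick."""
    counts = pick_counts(items, combos)
    return max(counts.values()) - min(counts.values()) <= 1
```

Calling `is_balanced(items, list(equal_distribution_combinations(items, 10, 10)))` would then give a yes/no result instead of a printed table.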

Answer 2

I would do something like this:

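import random

items = set(items)  # `items` is the list from the question
res = []
for _ in range(10):
    # list(...) because random.sample needs a sequence on Python 3.11+
    r = random.sample(list(items), 3)
    res.append(r)
    items -= set(r)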

All this does is grab 3 elements, store them, and then subtract them from the original set so that they can't be picked again.
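Because every pick removes items from the pool, this approach needs at least 10 * 3 = 30 distinct items; with the 7-item list from the question, `random.sample` raises a ValueError on the third pass. A minimal sketch that makes the requirement explicit is shown below. Note that it swaps in a different technique (shuffle once, then slice into chunks) rather than repeated sampling, and that `disjoint_picks` is a hypothetical name used only for illustration.

```python
import random


def disjoint_picks(items, rounds, k):
    """Return `rounds` lists of `k` items; no item appears in more than one list."""
    pool = list(set(items))  # drop duplicates, like the set() call in the answer
    if rounds * k > len(pool):
        raise ValueError("need at least rounds * k distinct items")
    random.shuffle(pool)
    # Consecutive slices of a shuffled list are random and cannot overlap.
    return [pool[i * k:(i + 1) * k] for i in range(rounds)]
```

Each item is used at most once overall, so this only gives equal counts when the list holds exactly `rounds * k` items.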

Answer 3

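OK, in the end I went with the following. It is a more limited implementation in which I set the number of times I want each item to be repeated, e.g. across 10 lists I want each item to be picked 5 times: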

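import random

List = ['airplane',
        'fish',
        'watch',
        'balloon',
        'headphones',
        'wineglass',
        'bowtie',
        'guitar',
        'desk',
        'bottle',
        'glove']  # there is more in my final list but keeping it short here
numIters = 5
numItems = len(List)
finalList = []
for curList in range(numIters):
    random.shuffle(List)
    # Integer division so the slice index is an int on Python 3 as well.
    finalList.append(List[:numItems // 2])  # first half of the shuffled list
    # Slice to the end (not to -1) so the last item is not dropped;
    # with an odd number of items this half is one item longer.
    finalList.append(List[numItems // 2:])  # second half of the shuffled list

print(finalList)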
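The shuffle-and-split idea can be sketched as a function, together with a quick count check. This is an illustration under assumptions, not the poster's final code: `half_splits` is a hypothetical name, the item list is shortened, and the second slice runs to the end of the list so that every item lands in exactly one of the two halves per shuffle.

```python
import random
from collections import Counter


def half_splits(items, num_iters):
    """Shuffle `num_iters` times; each shuffle yields two lists covering every item once."""
    pool = list(items)  # copy so the caller's list is not reordered
    mid = len(pool) // 2
    result = []
    for _ in range(num_iters):
        random.shuffle(pool)
        result.append(pool[:mid])  # first half of this shuffle
        result.append(pool[mid:])  # second half, through the last item
    return result


lists = half_splits(['airplane', 'fish', 'watch', 'balloon', 'headphones', 'wineglass'], 5)
counts = Counter(item for sub in lists for item in sub)
print(counts)  # every item should show a count of 5
```

With `num_iters = 5` this produces 10 lists, and each item appears once per shuffle, hence 5 times in total. The trade-off is that list size is tied to half the pool rather than chosen freely.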

Randomly select subsets from a list while keeping the number of picks per item equal in Python
