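I need to be able to display 1200 random combinations of 36 nCr 10. Since 36 nCr 10 has 254,186,856 combinations, I assume I won't be able to put all of them in a Python list.
How can I solve this? Should I use something other than Python, or look for a different algorithm? (I am currently using this: http://docs.python.org/library/itertools.html?highlight=itertools.combinations#itertools.combinations)
Edit: the combinations must not repeat, because then it would no longer be an nCr problem. I thought I should clarify that.
Here is the code so far: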
def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))  # a list, so the indices can be updated in place
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)

if __name__ == '__main__':
    teamList = list(combinations(range(36), 10))
After this, Python uses more than 2 GB of RAM and never seems to finish the computation.
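As a rough check of the numbers above, the sketch below counts the combinations and estimates how much memory a full list of them would need. It assumes Python 3.8+ (for `math.comb`) and a 64-bit CPython build; the 8-byte list slot is an assumption, and the result is only an order-of-magnitude estimate.

```python
import math
import sys

n, r = 36, 10
total = math.comb(n, r)  # number of 10-element combinations of 36 items

# Size of one 10-element tuple object as reported by CPython.
per_tuple = sys.getsizeof(tuple(range(r)))
list_slot = 8  # assumed size of one pointer in the list (64-bit build)

estimate_gib = total * (per_tuple + list_slot) / 2**30
print(total)
print(round(estimate_gib, 1), "GiB, rough estimate")
```

The estimate comes out far above 2 GB, which is consistent with the process exhausting memory before the list is complete.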
Answer 0 (score: 2)
Am I overthinking this?
from random import sample
dataset = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
for i in xrange(1200):
    print sample(dataset,10)
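Answer 1 (score: 1)
You can't use random.sample directly on the combinations iterator, but you can use it to create random indices:
import itertools
import random

indices = random.sample(xrange(number_of_combinations), 1200)
comb = itertools.combinations(range(36), 10)
prev = 0
for n in sorted(indices):
    # skip ahead to combination n, then take it
    print next(itertools.islice(comb, n-prev, None))
    prev = n + 1  # the iterator now sits just past combination n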
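random.sample picks each index only once, so you don't have to worry about duplicates. It also doesn't need to generate all 254,186,856 indices in order to choose 1200 of them.
If you have SciPy, you can easily compute the number of combinations with scipy.misc.comb, which is a fast and efficient way to calculate it (recent SciPy releases provide the same function as scipy.special.comb):
number_of_combinations = scipy.misc.comb(36, 10, exact=True)
Otherwise you can use this snippet: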
def number_of_combinations(n, k):
    if k < 0 or k > n:
        return 0
    if k > n - k: # take advantage of symmetry
        k = n - k
    c = 1
    for i in range(k):
        c = c * (n - (k - (i+1)))
        c = c // (i+1)
    return c
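The index-based approach above can be sketched as one self-contained Python 3 function. This is an illustration rather than the answer's exact code: `sample_combinations` and its `rng` parameter are names made up for this example, and `math.comb` (Python 3.8+) stands in for the counting helper.

```python
import itertools
import math
import random


def sample_combinations(n, r, sample_size, rng=random):
    """Return sample_size distinct r-combinations of range(n).

    The result is in lexicographic order because the chosen indices
    are visited in sorted order.
    """
    total = math.comb(n, r)
    wanted = sorted(rng.sample(range(total), sample_size))
    combos = itertools.combinations(range(n), r)
    picked = []
    position = 0  # index of the next combination the iterator will produce
    for index in wanted:
        # Skip forward to `index` and take exactly one combination.
        picked.append(next(itertools.islice(combos, index - position, None)))
        position = index + 1
    return picked


print(sample_combinations(6, 3, 4))
```

Note that the iterator still has to step through every combination up to the largest chosen index, so for 36 nCr 10 this walks through most of the 254,186,856 combinations even though only 1200 are kept.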
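Answer 2 (score: 0)
Try the following implementation.
>>> def nCr(data,r,size):
...     result=set()
...     while len(result) < size:
...         # sort each sample so the same combination in a different order is not counted twice
...         result.add(''.join(sorted(random.sample(data,r))))
...     return list(result)
To generate 1200 unique samples of 36 C r from given data of length 36, you could do something like this:
>>> data = string.ascii_letters[:36]
>>> print nCr(data,10,1200)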
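The retry loop above rarely has to retry. As a back-of-the-envelope check, the birthday-problem approximation below estimates the expected number of colliding pairs when 1200 combinations are drawn uniformly from all 36 nCr 10 possibilities. It assumes each draw is a uniformly random combination and that Python 3.8+ is available for `math.comb`.

```python
import math

total = math.comb(36, 10)  # size of the population of combinations
draws = 1200

# Expected number of pairs of draws that land on the same combination.
expected_collisions = draws * (draws - 1) / (2 * total)

# Poisson approximation for the chance of at least one collision.
p_any_collision = 1 - math.exp(-expected_collisions)

print(expected_collisions)
print(p_any_collision)
```

Both values come out at roughly 0.003, so the set almost always fills in exactly 1200 draws.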
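Answer 3 (score: 0)
You can use a technique that iterates over the whole sequence but slightly increases the probability of picking an element at each step. This produces an unbiased random sample of 1200 elements (depending on the probability you choose).
It forces you to generate more or less the entire sequence, but you don't have to hold the elements in memory. See, for example: http://www.javamex.com/tutorials/random_numbers/random_sample.shtml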
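One common form of this idea is often called selection sampling: each element is kept with probability (items still needed) / (items still remaining), so the probability rises whenever an element is skipped. The sketch below is a minimal illustration under that assumption and may differ from the code on the linked page; `selection_sample` and its parameters are hypothetical names.

```python
import itertools
import math
import random


def selection_sample(iterable, population_size, sample_size, rng=random):
    """Yield sample_size items from iterable, every subset equally likely.

    population_size must be the exact number of items in iterable.
    """
    needed = sample_size
    remaining = population_size
    for item in iterable:
        if needed == 0:
            break
        # Keep this item with probability needed / remaining.
        if rng.random() * remaining < needed:
            yield item
            needed -= 1
        remaining -= 1


combos = itertools.combinations(range(6), 3)
print(list(selection_sample(combos, math.comb(6, 3), 5)))
```

Only the selected items are stored, and the loop stops as soon as the sample is complete, but on average it still visits most of the sequence.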
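Answer 4 (score: 0)
I think this will give you the answer you want:
I iterate over a generator that contains all possible combinations and pick out n of them at random.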
from itertools import combinations
import random as rn
from math import factorial
import time

def choose_combos(n,r,n_chosen):
    total_combs = factorial(n)/(factorial(n-r)*factorial(r))
    combos = combinations(range(n),r)
    chosen_indexes = rn.sample(xrange(total_combs),n_chosen)
    random_combos = []
    for i in xrange(total_combs):
        ele = combos.next()
        if i in chosen_indexes:
            random_combos.append(ele)
    return random_combos

start_time = time.time()
print choose_combos(36,10,5) #you would have to use 1200 instead of 5
print 'time taken', time.time() - start_time
This takes a while, but nothing crazy:
[(0, 6, 10, 13, 22, 27, 28, 29, 30, 35), (3, 4, 11, 12, 13, 16, 19, 26, 31, 33), (3, 7, 8, 9, 11, 19, 20, 23, 28, 30), (3, 10, 15, 19, 23, 24, 29, 30, 33, 35), (7, 14, 16, 20, 22, 25, 29, 30, 33, 35)]
time taken 111.286000013
Answer 5 (score: 0)
This looks like an application of my recent answer to another question.
n C k:
def choose(n, k):
    '''Returns the number of ways to choose k items from n items'''
    reflect = n - k
    if k > reflect:
        if k > n:
            return 0
        k = reflect
    if k == 0:
        return 1
    for nMinusIPlus1, i in zip(range(n - 1, n - k, -1), range(2, k + 1)):
        n = n * nMinusIPlus1 // i
    return n
The combination at a given index in the lexicographically sorted list of all combinations of n C k:
def iterCombination(index, n, k):
    '''Yields the items of the single combination that would be at the provided
    (0-based) index in a lexicographically sorted list of combinations of choices
    of k items from n items [0,n), given the combinations were sorted in
    descending order. Yields in descending order.
    '''
    nCk = 1
    for nMinusI, iPlus1 in zip(range(n, n - k, -1), range(1, k + 1)):
        nCk *= nMinusI
        nCk //= iPlus1
    curIndex = nCk
    for k in range(k, 0, -1):
        nCk *= k
        nCk //= n
        while curIndex - nCk > index:
            curIndex -= nCk
            nCk *= (n - k)
            nCk -= nCk % k
            n -= 1
            nCk //= n
        n -= 1
        yield n
A random sample of the combinations, without creating the list of combinations:
import random

def iterRandomSampleOfCombinations(items, combinationSize, sampleSize):
    '''Yields a random sample of sampleSize combinations, each composed of
    combinationSize elements chosen from items.
    The sample is as per random.sample, thus any sub-slice will also be a valid
    random sample.
    Each combination will be a reverse ordered list of items - one could reverse
    them or shuffle them post yield if need be.
    '''
    n = len(items)
    if n < 1 or combinationSize < 1 or combinationSize > n:
        return
    nCk = choose(n, combinationSize)
    if sampleSize > nCk:
        return
    for sample in random.sample(range(nCk), sampleSize):
        yield [items[i] for i in iterCombination(sample, n, combinationSize)]
Example: a sample of 29 combinations of length 10 chosen from the 36 items [A-Z] + [a-j]:
>>> items = [chr(i) for i in range(65, 91)] + [chr(i) for i in range(97, 107)]
>>> len(items)
36
>>> for combination in combinations.iterRandomSampleOfCombinations(items, 10, 29):
...     combination
...
['i', 'e', 'b', 'Z', 'U', 'Q', 'N', 'M', 'H', 'A']
['j', 'i', 'h', 'g', 'f', 'Z', 'P', 'I', 'G', 'E']
['e', 'a', 'Z', 'U', 'Q', 'L', 'G', 'F', 'C', 'B']
['i', 'h', 'f', 'Y', 'X', 'W', 'V', 'P', 'I', 'H']
['g', 'Y', 'V', 'S', 'R', 'N', 'M', 'L', 'K', 'I']
['j', 'i', 'f', 'e', 'd', 'b', 'Z', 'X', 'W', 'L']
['g', 'f', 'e', 'Z', 'T', 'S', 'P', 'L', 'J', 'E']
['d', 'c', 'Z', 'X', 'V', 'U', 'S', 'I', 'H', 'C']
['f', 'e', 'Y', 'U', 'I', 'H', 'D', 'C', 'B', 'A']
['j', 'd', 'b', 'W', 'Q', 'P', 'N', 'M', 'F', 'B']
['j', 'a', 'V', 'S', 'P', 'N', 'L', 'J', 'H', 'G']
['g', 'f', 'e', 'a', 'W', 'V', 'O', 'N', 'J', 'D']
['a', 'Z', 'Y', 'W', 'Q', 'O', 'N', 'F', 'B', 'A']
['i', 'g', 'a', 'X', 'V', 'S', 'Q', 'P', 'H', 'D']
['c', 'b', 'a', 'T', 'P', 'O', 'M', 'I', 'D', 'B']
['i', 'f', 'b', 'Y', 'X', 'W', 'V', 'U', 'M', 'A']
['j', 'd', 'U', 'T', 'S', 'K', 'G', 'F', 'C', 'B']
['c', 'Z', 'X', 'U', 'T', 'S', 'O', 'M', 'F', 'D']
['g', 'f', 'X', 'S', 'P', 'M', 'F', 'D', 'C', 'B']
['f', 'Y', 'W', 'T', 'P', 'M', 'J', 'H', 'D', 'C']
['h', 'b', 'Y', 'X', 'W', 'Q', 'K', 'F', 'C', 'B']
['j', 'g', 'Z', 'Y', 'T', 'O', 'L', 'G', 'E', 'D']
['h', 'Z', 'Y', 'S', 'R', 'Q', 'H', 'G', 'F', 'E']
['i', 'c', 'X', 'V', 'R', 'P', 'N', 'L', 'J', 'E']
['f', 'b', 'Z', 'Y', 'W', 'V', 'Q', 'N', 'G', 'D']
['f', 'd', 'c', 'b', 'V', 'T', 'S', 'R', 'Q', 'B']
['i', 'd', 'W', 'U', 'S', 'O', 'N', 'M', 'K', 'G']
['g', 'f', 'a', 'W', 'V', 'T', 'S', 'R', 'H', 'B']
['g', 'f', 'a', 'W', 'T', 'S', 'O', 'L', 'K', 'G']
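Note: the combinations themselves are (reverse) sorted, but the sample is random (as is any slice of it, since random.sample is applied to a range object). If you also need a random order within each combination, just call random.shuffle(combination).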
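For comparison, the same index-to-combination idea can be sketched more compactly with `math.comb` (Python 3.8+). This is an alternative illustration, not the answer's code: it yields each combination in ascending order, and `unrank_combination` and `random_combinations` are hypothetical names. It assumes `0 <= index < math.comb(n, k)`.

```python
import itertools
import math
import random


def unrank_combination(index, n, k):
    """Return the k-combination of range(n) at `index` in lexicographic order."""
    combo = []
    candidate = 0
    for slots_left in range(k, 0, -1):
        # Number of combinations that use `candidate` in the current slot.
        block = math.comb(n - candidate - 1, slots_left - 1)
        while index >= block:
            index -= block
            candidate += 1
            block = math.comb(n - candidate - 1, slots_left - 1)
        combo.append(candidate)
        candidate += 1
    return tuple(combo)


def random_combinations(items, k, sample_size, rng=random):
    """Yield sample_size distinct k-combinations of items, in random order."""
    total = math.comb(len(items), k)
    for index in rng.sample(range(total), sample_size):
        yield tuple(items[i] for i in unrank_combination(index, len(items), k))


items = [chr(i) for i in range(65, 91)] + [chr(i) for i in range(97, 107)]
for combination in random_combinations(items, 10, 3):
    print(combination)
```

Because distinct indices map to distinct combinations, no duplicate check is needed, and only the sampled combinations are ever constructed.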
It is also pretty quick (although a different ordering of the combinations might be faster?):
>>> import time
>>> samples = 1000
>>> sampleSize = 1200
>>> combinationSize = 10
>>> len(items)
36
>>> while 1:
...     start = time.perf_counter()  # time.clock() was removed in Python 3.8
...     for i in range(samples):
...         for combination in iterRandomSampleOfCombinations(items, combinationSize, sampleSize):
...             pass
...     end = time.perf_counter()
...     print("{0} seconds per length {1} sample of {2}C{3}".format((end - start)/samples, sampleSize, len(items), combinationSize))
...     break
...
0.03162827446371375 seconds per length 1200 sample of 36C10