Get all unique combinations from a list of lists, up to the nth combination

Time: 2017-11-08 15:34:15

Tags: python algorithm list

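I have a list of lists, where each item in an inner list is the path to an image. Each inner list usually has about 35 items, and the outer list contains 9 such lists, e.g.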

 [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]]

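I want to generate unique combinations of 9 items (one from each inner list) from this list of lists. itertools.product does the job, but it takes so long that it crashes my computer. What I need is something that produces only the first n combinations, where n may be around 200. I have tried...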

  list(itertools.product(*z))[:200]

where z is my list of lists, but it does not work, because list() generates every combination before the slice is applied (too slow).
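To give a sense of scale, the number of tuples that list(...) would have to build is the product of the inner list lengths. A minimal sketch, assuming 9 inner lists of 35 items as in the example above (the name total is only illustrative):

```python
# Assumed shape from the example: 9 inner lists of 35 items each.
z = [list(range(1, 36)) for _ in range(9)]

total = 1
for inner in z:
    total *= len(inner)  # size of the full Cartesian product

print(total)  # same value as 35 ** 9
```

That is tens of trillions of tuples, which is why the full list never finishes building.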

Is there another, more efficient way to do this?

Edit: I should add that I need the result converted to a list of lists...

Benchmarks:

Brad:

 def combos():
     my_iter = itertools.product(*z)
     print([next(my_iter) for i in range(1, 10000)])

cProfile.run('print(combos())')
     10006 function calls in 0.021 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    0.021    0.021 <string>:1(<module>)
    1    0.000    0.000    0.021    0.021 gcm.py:17(combos)
    1    0.002    0.002    0.004    0.004 gcm.py:19(<listcomp>)
    1    0.000    0.000    0.021    0.021 {built-in method builtins.exec}
 9999    0.003    0.000    0.003    0.000 {built-in method builtins.next}
    2    0.016    0.008    0.016    0.008 {built-in method builtins.print}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

DeepSpace's updated answer:

 cProfile.run('print(list(my_gen(z, 10000)))')

 10005 function calls in 0.019 seconds

 Ordered by: standard name

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.001    0.001    0.019    0.019 <string>:1(<module>)
10001    0.004    0.000    0.004    0.000 gcm.py:10(my_gen)
    1    0.000    0.000    0.019    0.019 {built-in method builtins.exec}
    1    0.014    0.014    0.014    0.014 {built-in method builtins.print}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

To take this a step further and keep only every 1000th combination, I did...

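   def my_gen(z, limit):
      count = 0
      for i in itertools.product(*z):
         if count >= limit:
            # Stop here instead of walking the whole product; raising
            # StopIteration in a generator is a RuntimeError since Python 3.7.
            return
         if count % 1000 == 0:
            yield i  # keep every 1000th combination
         count += 1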
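The same striding can probably be expressed with itertools.islice, which accepts a step argument. A minimal sketch, assuming the small z below as a stand-in for the real data; strided_product is an illustrative name, not something from the thread:

```python
import itertools

def strided_product(z, limit, step):
    # Lazily take every `step`-th combination among the first `limit`.
    return itertools.islice(itertools.product(*z), 0, limit, step)

z = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]  # small stand-in for the real lists
picked = [list(combo) for combo in strided_product(z, 20, 5)]
print(picked)
```

With the real data, strided_product(z, 10000, 1000) would take the combinations at positions 0, 1000, ..., 9000 without building the rest.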

3 answers:

Answer 0 (score: 2)

Try this:

my_iter = itertools.product(*z)
[next(my_iter) for i in range(200)]
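Since the question asks for a list of lists, one possible variant uses itertools.islice and converts each tuple. This is a sketch under the assumption that z has the shape shown in the question; first_combinations is a hypothetical helper name:

```python
import itertools

def first_combinations(z, n):
    # islice stops after n items, so the full product is never built.
    return [list(combo) for combo in itertools.islice(itertools.product(*z), n)]

z = [list(range(1, 36)) for _ in range(9)]
result = first_combinations(z, 200)
print(len(result), result[0], result[-1])
```

Unlike calling next() in a comprehension, islice simply returns fewer items if the product has fewer than n combinations.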

Answer 1 (score: 1)

Instead of converting itertools.product(*z) to a list and slicing it, wrap it in your own generator:

def my_gen(z, limit):
    count = 0
    for i in itertools.product(*z):
        if count < limit:
            yield i
            count += 1
        else:
            # End the generator; raising StopIteration here is a
            # RuntimeError since Python 3.7 (PEP 479).
            return

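Edit: Brad's answer shows a similar idea, but it still creates a list of 200 elements in memory, whereas my approach does not (unless it is called inside list(...)).

Edit 2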

list(...)
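One caveat worth checking on current interpreters: since Python 3.7 (PEP 479), a StopIteration raised inside a generator body is converted into a RuntimeError, so a plain return is the usual way to end a generator early. A small sketch of the difference; raising_gen and returning_gen are illustrative names, and the tiny z is only for demonstration:

```python
import itertools

def raising_gen(z, limit):
    count = 0
    for combo in itertools.product(*z):
        if count < limit:
            yield combo
            count += 1
        else:
            raise StopIteration  # converted to RuntimeError on Python 3.7+

def returning_gen(z, limit):
    count = 0
    for combo in itertools.product(*z):
        if count >= limit:
            return  # ends the generator cleanly
        yield combo
        count += 1

z = [[1, 2], [1, 2]]
try:
    list(raising_gen(z, 2))
    outcome = "no error"
except RuntimeError:
    outcome = "RuntimeError"

print(outcome, list(returning_gen(z, 2)))
```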

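Answer 2 (score: 0)

The previous answers each have strengths and weaknesses. Wrapping itertools.product in your own generator is better than having to write a list comprehension every time, but if the generator is not stopped once the limit is reached, it keeps looping until itertools.product itself is exhausted.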

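def capped_product(l, limit):
    i = itertools.product(*l)
    for _ in range(limit):
        try:
            yield next(i)
        except StopIteration:
            return  # fewer than `limit` combinations exist
    # Falling off the end finishes the generator; an explicit
    # `raise StopIteration` is a RuntimeError since Python 3.7.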

list(capped_product(z, 200))

Another approach is to use a generator that hands out lists of the desired size incrementally. For example:

z = [[1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5]]
n = 3

# First time you run it
first = [(1,1,1), (1,1,2), (1,1,3)]
# Second time you run it
second = [(1,1,4), (1,1,5), (1,2,1)]
# Third time you run it
third = [(1,2,2), (1,2,3), (1,2,4)]
# ...

Each time you ask for a list, you get the next n combinations of the lists in z. This can be done as follows:

def split_product(l, size):
    out = []
    for i in itertools.product(*l):
        out.append(i)
        if len(out) == size:
            yield out
            out = []
    if out:
        yield out  # final, shorter chunk

z = [[i for i in range(1, 36)] for _ in range(9)]
i = split_product(z, 200)
first = next(i)
second = next(i)
# ...
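The same chunking could also be sketched with itertools.islice, which avoids the manual append loop. This is an assumption-laden variant rather than the answer's own code; chunked_product is an illustrative name and the small z mirrors the earlier example:

```python
import itertools

def chunked_product(lists, size):
    # Yield successive lists of up to `size` combinations, built lazily.
    combos = itertools.product(*lists)
    while True:
        chunk = list(itertools.islice(combos, size))
        if not chunk:
            return  # product exhausted
        yield chunk

z = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
chunks = chunked_product(z, 3)
first = next(chunks)
second = next(chunks)
print(first, second)
```

The last chunk may be shorter than size when the total number of combinations is not a multiple of it.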