Question

从previous question我学到了一些有趣的东西。如果Python的itertools.product被赋予一系列迭代器，那么这些迭代器将在笛卡尔积开始之前转换为元组。 Related questions查看itertools.product的源代码，得出结论：虽然没有中间结果存储在内存中，但是在产品迭代开始之前创建了原始迭代器的元组版本。

问题：当（元组转换的）输入太大而无法保存在内存中时，有没有办法为笛卡尔积创建迭代器？琐碎的例子：

import itertools A = itertools.permutations(xrange(100)) itertools.product(A)

更实际的用例将采用一系列(*iterables[, repeat])，就像函数的原始实现一样 - 上面只是一个例子。它看起来不像你可以使用itertools.product的当前实现，所以我欢迎在纯python中提交（尽管你无法击败itertools的C后端！）。

Answer 1

这是一个调用callables并迭代iterables的实现，它们可以重新启动：

def product(*iterables, **kwargs):
    if len(iterables) == 0:
        yield ()
    else:
        iterables = iterables * kwargs.get('repeat', 1)
        it = iterables[0]
        for item in it() if callable(it) else iter(it):
            for items in product(*iterables[1:]):
                yield (item, ) + items

测试：

import itertools
g = product(lambda: itertools.permutations(xrange(100)),
            lambda: itertools.permutations(xrange(100)))
print next(g)
print sum(1 for _ in g)

Answer 2

没有“迭代者娱乐”，第一个因素可能是有可能的。但这样只能节省1 / n的空间（其中n是因子的数量）并增加了混乱。

所以答案是迭代者娱乐。函数的客户端必须确保迭代器的创建是纯粹的（没有副作用）。像

def iterProduct(ic):
    if not ic:
        yield []
        return

    for i in ic[0]():
        for js in iterProduct(ic[1:]):
            yield [i] + js


# Test
x3 = lambda: xrange(3)
for i in iterProduct([x3,x3,x3]):
    print i

Answer 3

使用标准Python生成器无法做到这一点，因为某些迭代必须循环多次。您必须使用某种能够“重复”的数据类型。我创建了一个简单的“可重复”类和一个非递归的产品算法。 product应该有更多的错误检查，但这至少是第一种方法。简单的可重复的课程......

class PermutationsReiterable(object):
    def __init__(self, value):
        self.value = value
    def __iter__(self):
        return itertools.permutations(xrange(self.value))

product iteslf ......

def product(*reiterables, **kwargs):
    if not reiterables:
        yield ()
        return
    reiterables *= kwargs.get('repeat', 1)

    iterables = [iter(ri) for ri in reiterables]
    try:
        states = [next(it) for it in iterables]
    except StopIteration:
        # outer product of zero-length iterable is empty
        return
    yield tuple(states)

    current_index = max_index = len(iterables) - 1
    while True:
        try:
            next_item = next(iterables[current_index])
        except StopIteration:
            if current_index > 0:
                new_iter = iter(reiterables[current_index])
                next_item = next(new_iter)
                states[current_index] = next_item
                iterables[current_index] = new_iter
                current_index -= 1
            else:
                # last iterable has run out; terminate generator
                return
        else:
            states[current_index] = next_item
            current_index = max_index
            yield tuple(states)

测试：

>>> pi2 = PermutationsReiterable(2)
>>> list(pi2); list(pi2)
[(0, 1), (1, 0)]
[(0, 1), (1, 0)]
>>> list(product(pi2, repeat=2))
[((0, 1), (0, 1)), ((0, 1), (1, 0)), ((1, 0), (0, 1)), ((1, 0), (1, 0))]
>>> giant_product = product(PermutationsReiterable(100), repeat=5)
>>> len(list(itertools.islice(giant_product, 0, 5)))
5
>>> big_product = product(PermutationsReiterable(10), repeat=2)
>>> list(itertools.islice(big_product, 0, 5))
[((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)), 
 ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 9, 8)), 
 ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 8, 7, 9)), 
 ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 8, 9, 7)), 
 ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 9, 7, 8))]

Answer 4

我很抱歉这个话题，但在花了几个小时调试程序试图迭代递归生成的生成器的笛卡尔积。我可以告诉你，如果不像上面的所有例子中那样处理常数，上述解决方案都不起作用。

更正：

from itertools import tee

def product(*iterables, **kwargs):
    if len(iterables) == 0:
        yield ()
    else:
        iterables = iterables * kwargs.get('repeat', 1)
        it = iterables[0]
        for item in it() if callable(it) else iter(it):
            iterables_tee = list(map(tee, iterables[1:]))
            iterables[1:] = [it1 for it1, it2 in iterables_tee]
            iterable_copy = [it2 for it1, it2 in iterables_tee]
            for items in product(*iterable_copy):
                yield (item, ) + items

如果您的生成器包含生成器，则需要将副本传递给递归调用。

大型迭代器的笛卡尔积（itertools）

4 个答案: