Question

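I'd like to understand how the following code works. It comes from http://docs.python.org/library/itertools.html#itertools.izip_longest and is the pure-Python equivalent of the izip_longest iterator. I'm especially mystified by the sentinel function: how does it work?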

from itertools import chain, izip, repeat  # Python 2 names

def izip_longest(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
        yield counter()         # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except IndexError:
        pass

Answer 1

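OK, we can do this. About the sentinel: the expression ([fillvalue]*(len(args)-1)) creates a list containing one fill value for each iterable in args, minus one. For the example above, that is ['-']. counter is then assigned the pop method of that list. sentinel itself is a generator that pops one item from that list when it is iterated. You can iterate over each iterator returned by sentinel exactly once, and it will always yield fillvalue. The total number of items yielded by all iterators returned by sentinel is len(args) - 1 (thanks to Sven Marnach for clarifying that, I had misunderstood it).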
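As a small illustration of the counting described above, the sketch below builds the same list for three hypothetical iterables and calls the bound pop method directly. The names args, popped and exhausted are made up for this example and do not appear in the documentation code.

```python
fillvalue = '-'
args = ('ABCD', 'xy', 'pqr')  # three hypothetical iterables

# Bound method of a list holding len(args) - 1 == 2 fill values.
counter = ([fillvalue] * (len(args) - 1)).pop

# The first two calls each remove and return one fill value.
popped = [counter(), counter()]

# The third call finds the list empty and raises IndexError.
try:
    counter()
    exhausted = False
except IndexError:
    exhausted = True
```

Here popped ends up holding two fill values and exhausted is True, which is the "len(args) - 1 items in total" behaviour the answer describes.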

Now look at this:

iters = [chain(it, sentinel(), fillers) for it in args]

This is the trick. iters is a list containing one iterator for each iterable in args. Each of these iterators does the following:

Iterates over all items of the corresponding iterable in args.
Iterates over sentinel once, yielding fillvalue.
Repeats fillvalue forever.
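The three stages can be seen on a single chained iterator. This is a minimal sketch that assumes two input iterables (so the shared list holds one fill value) and uses islice only to cut off the infinite tail; the variable names are illustrative.

```python
from itertools import chain, islice, repeat

fillvalue = '-'

# One fill value in the shared list, as if two iterables had been passed.
def sentinel(counter=([fillvalue] * 1).pop):
    yield counter()

# Stage 1: items of 'xy'; stage 2: one value from sentinel; stage 3: endless fillers.
it = chain('xy', sentinel(), repeat(fillvalue))

# Take only the first six items, since the chain never ends on its own.
first_six = list(islice(it, 6))
```

The first two items come from 'xy', the third from the sentinel, and the rest from repeat(fillvalue).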

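Now, as established earlier, the sentinels taken together can only be iterated len(args)-1 times before an IndexError is raised. Each iterable that runs out consumes one fill value through its sentinel, so every iterable except the last one to finish gets through its sentinel without trouble. That is fine, because one of the iterables is the longest. So when the IndexError is finally raised, it means we have finished iterating over the longest iterable in args.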
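To watch the whole mechanism run, here is a sketch of the same logic ported to Python 3. It is an assumption-laden port, not the documentation's code: zip stands in for izip, fillvalue is a keyword-only argument, the counter is bound in the enclosing scope rather than as a default argument, and the name zip_longest_sketch is made up for this example.

```python
from itertools import chain, repeat

def zip_longest_sketch(*args, fillvalue=None):
    # Shared pop method: succeeds len(args) - 1 times, then raises IndexError.
    counter = ([fillvalue] * (len(args) - 1)).pop

    def sentinel():
        yield counter()

    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except IndexError:
        # Raised when the last (longest) iterable reaches its sentinel.
        pass

result = list(zip_longest_sketch('ABCD', 'xy', fillvalue='-'))
```

With 'ABCD' and 'xy', the sentinel of 'xy' takes the only fill value after two items, and the sentinel of 'ABCD' raises IndexError after four, which ends the loop.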

You're welcome.

P.S.: I hope this was understandable.

Answer 2

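The function sentinel() returns an iterator that yields fillvalue exactly once. The total number of fillvalues yielded by all iterators returned by sentinel() is limited to n-1, where n is the number of iterators passed to izip_longest(). Once these n-1 fillvalues have been used up, any further iteration over an iterator returned by sentinel() raises an IndexError.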

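This function is used to detect whether all iterators have been exhausted: each iterator is chain()ed with an iterator returned by sentinel(). Once all iterators are exhausted, an iterator returned by sentinel() is iterated over for the n-th time, which raises an IndexError and in turn triggers the end of izip_longest().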

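So far I have explained what sentinel() does, not how it works. When izip_longest() is called, the definition of sentinel() is evaluated. While the definition is evaluated, the default argument of sentinel() is evaluated too, once per call to izip_longest(). The code is equivalent to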

fillvalue_list = [fillvalue] * (len(args)-1)
def sentinel():
    yield fillvalue_list.pop()

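Storing this in a default argument instead of in a variable in the enclosing scope is just an optimisation, as is including .pop in the default argument: it saves looking the method up each time an iterator returned by sentinel() is iterated over.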
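The equivalence claimed above can be checked side by side. The sketch below is illustrative only: make_sentinels and drain are hypothetical helpers, and the inspection via __defaults__ and __self__ relies on standard CPython function and bound-method attributes.

```python
def make_sentinels(fillvalue, n_args):
    # Variant 1: the list lives in the enclosing scope.
    fillvalue_list = [fillvalue] * (n_args - 1)

    def closure_sentinel():
        yield fillvalue_list.pop()

    # Variant 2: the bound pop method is stored as a default argument,
    # evaluated once, when this def statement runs.
    def default_sentinel(counter=([fillvalue] * (n_args - 1)).pop):
        yield counter()

    return closure_sentinel, default_sentinel

def drain(sentinel):
    """Call sentinel() repeatedly, collecting values until IndexError."""
    values = []
    try:
        while True:
            values.append(next(sentinel()))
    except IndexError:
        return values

closure_sentinel, default_sentinel = make_sentinels('-', 3)
closure_values = drain(closure_sentinel)
default_values = drain(default_sentinel)

# The default argument is the bound method; its __self__ is the shared list.
bound_pop = default_sentinel.__defaults__[0]
remaining = bound_pop.__self__
```

Both variants hand out the same number of fill values before failing; the default-argument form only avoids a name lookup and an attribute lookup on each iteration.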

Answer 3

The definition of sentinel is almost equivalent to

def sentinel():
    yield ([fillvalue] * (len(args) - 1)).pop()

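except that it takes the pop bound method (a function object) as a default argument. Default arguments are evaluated when the function is defined, so once per call to izip_longest rather than once per call to sentinel. The function object therefore "remembers" the list [fillvalue] * (len(args) - 1) instead of constructing it anew on each call.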
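The word "almost" matters here, and a short sketch can show why. Assuming two hypothetical input iterables, the version that rebuilds the list on every call never runs out, while the default-argument version shares one list across calls. The names naive_sentinel and shared_sentinel are made up for this comparison.

```python
fillvalue = '-'
args = ('ABCD', 'xy')  # two hypothetical iterables

def naive_sentinel():
    # Builds a fresh one-element list on every call, so pop() never fails.
    yield ([fillvalue] * (len(args) - 1)).pop()

def shared_sentinel(counter=([fillvalue] * (len(args) - 1)).pop):
    # The list is built once, at definition time, and shared by all calls.
    yield counter()

# The naive version keeps yielding fill values indefinitely.
naive_values = [next(naive_sentinel()) for _ in range(3)]

# The shared version yields once, then raises IndexError on the next call.
shared_first = next(shared_sentinel())
try:
    next(shared_sentinel())
    shared_raised = False
except IndexError:
    shared_raised = True
```

With the naive version, izip_longest would never see an IndexError and would loop forever on the fillers, so the remembered list is what makes the termination check work.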

izip_longest in itertools: what's going on here?
