
Question

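Can you think of a nice way (maybe with itertools) to split an iterator into chunks of a given size?

So with l=[1,2,3,4,5,6,7], chunks(l,3) becomes an iterator over [1,2,3], [4,5,6], [7].

I can think of a small program to do that, but not a nice way that uses, say, itertools.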

Answer 1

The grouper() recipe from the recipes section of the itertools documentation comes close to what you want:

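from itertools import zip_longest  # named izip_longest on Python 2

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    # n references to the same iterator: zipping them pulls n consecutive items
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)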

It will pad the last chunk with a fill value, though.
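
To make the padding concrete, here is a small sketch of the recipe run on a seven-item input. It is written for Python 3, where izip_longest is called zip_longest, and the name grouper_padded is used only for illustration:

```python
from itertools import zip_longest

def grouper_padded(n, iterable, fillvalue=None):
    # Same idea as the recipe: n references to one shared iterator.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

padded = list(grouper_padded(3, [1, 2, 3, 4, 5, 6, 7]))
print(padded)  # the last tuple is padded with the fill value (None here)
```

The final group comes back as (7, None, None) rather than (7,), which is the behaviour the question wants to avoid.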

A less general solution that only works on sequences, but does handle the last chunk as desired, is

[my_list[i:i + chunk_size] for i in range(0, len(my_list), chunk_size)]
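
As a quick illustration of the slicing approach, the sketch below wraps the comprehension in a helper (the name chunk_sequence is hypothetical) and applies it to a list and a string. It relies on len() and slicing, so it is not expected to work on generators:

```python
def chunk_sequence(seq, chunk_size):
    # Requires len() and slicing, so generators and file objects will not work.
    return [seq[i:i + chunk_size] for i in range(0, len(seq), chunk_size)]

list_chunks = chunk_sequence([1, 2, 3, 4, 5, 6, 7], 3)
str_chunks = chunk_sequence("ABCDEFG", 3)
print(list_chunks)
print(str_chunks)
```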

Finally, a solution that works on general iterators and behaves as desired is

import itertools

def grouper(n, iterable):
    it = iter(iterable)
    while True:
        # islice takes up to n items; an empty tuple means the input is exhausted
        chunk = tuple(itertools.islice(it, n))
        if not chunk:
            return
        yield chunk
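
On Python 3.12 and later the standard library ships itertools.batched, which yields tuples in the same way. The sketch below uses it when available and otherwise falls back to the islice loop; the fallback is an illustrative stand-in, not the library's implementation:

```python
from itertools import islice

try:
    from itertools import batched  # available on Python 3.12+
except ImportError:
    def batched(iterable, n):
        """Fallback sketch that yields tuples of up to n items."""
        it = iter(iterable)
        while True:
            chunk = tuple(islice(it, n))
            if not chunk:
                return
            yield chunk

batches = list(batched([1, 2, 3, 4, 5, 6, 7], 3))
print(batches)
```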

Answer 2

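Although the OP asks for a function that returns the chunks as lists or tuples, Sven Marnach's solution can be modified in case you need it to return iterators: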

import itertools

def grouper_it(n, iterable):
    it = iter(iterable)
    while True:
        chunk_it = itertools.islice(it, n)
        try:
            first_el = next(chunk_it)
        except StopIteration:
            return
        yield itertools.chain((first_el,), chunk_it)

Some benchmarks: http://pastebin.com/YkKFvm8b

It will be slightly more efficient only if your function iterates through the elements of every chunk.
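
One consequence of returning iterators is that all chunks share a single underlying iterator, so they need to be consumed in order. The sketch below repeats the function so that it runs on its own and contrasts the two usage patterns:

```python
import itertools

def grouper_it(n, iterable):
    it = iter(iterable)
    while True:
        chunk_it = itertools.islice(it, n)
        try:
            first_el = next(chunk_it)
        except StopIteration:
            return
        yield itertools.chain((first_el,), chunk_it)

# Consuming each chunk before asking for the next one works as expected.
in_order = [list(c) for c in grouper_it(3, range(7))]

# Collecting the chunk iterators first advances the shared iterator by one
# item per chunk, so every chunk ends up holding a single element.
collected_first = [list(c) for c in list(grouper_it(3, range(7)))]

print(in_order)
print(collected_first)
```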

Answer 3

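This works on any iterable (pass it through iter() first, since the code calls next() on it). It returns a generator of generators (for full flexibility). I now realize that it's basically the same as @reclosedevs solution, but without the fluff. No try...except is needed, because the StopIteration propagates up and ends the generator, which is what we want. (That holds on Python 2 and on Python 3 before 3.7; from 3.7 on, PEP 479 turns a StopIteration that escapes a generator into a RuntimeError.)

The next(iterable) call is needed to raise StopIteration when the iterable is empty, since islice will keep spawning empty generators forever if you let it.

It's better because it's only two lines long, yet easy to understand.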

import itertools

def grouper(iterable, n):
    # `iterable` must already be an iterator, e.g. iter(my_list)
    while True:
        yield itertools.chain((next(iterable),), itertools.islice(iterable, n-1))

Note that next(iterable) is put into a tuple. Otherwise, if next(iterable) were itself iterable, itertools.chain would flatten it out. Thanks to Jeremy Brown for pointing out this issue.
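
A sketch of a variant that should also run on Python 3.7 and later, where a StopIteration escaping a generator body is converted to RuntimeError (PEP 479). The name grouper_guarded and the explicit iter() call are additions for illustration, not part of the answer above:

```python
import itertools

def grouper_guarded(iterable, n):
    it = iter(iterable)  # accept any iterable, not only iterators
    while True:
        try:
            first = next(it)
        except StopIteration:
            return  # end the generator explicitly (PEP 479)
        # Wrap `first` in a tuple so chain() does not flatten it.
        yield itertools.chain((first,), itertools.islice(it, n - 1))

pairs = [list(c) for c in grouper_guarded(["ab", "cd", "ef"], 2)]
print(pairs)
```

The string items stay intact because each first element is wrapped in a one-item tuple before being chained.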

Answer 4

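I was working on something today and came up with what I think is a simple solution. It is similar to jsbueno's answer, but I believe his would yield an empty group when the length of iterable is divisible by n. My answer does a simple check when the iterable is exhausted.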

def chunk(iterable, chunk_size):
    """Generate sequences of `chunk_size` elements from `iterable`."""
    iterable = iter(iterable)
    while True:
        chunk = []
        try:
            for _ in range(chunk_size):
                chunk.append(next(iterable))  # next() works on Python 2.6+ and 3
            yield chunk
        except StopIteration:
            if chunk:
                yield chunk
            break
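
A quick check of the claim about exact multiples, as a Python 3 sketch. It uses next(iterable) in place of iterable.next(), and the name chunk_lists is illustrative:

```python
def chunk_lists(iterable, chunk_size):
    iterable = iter(iterable)
    while True:
        chunk = []
        try:
            for _ in range(chunk_size):
                chunk.append(next(iterable))
            yield chunk
        except StopIteration:
            # Only emit the partial chunk if it actually holds something.
            if chunk:
                yield chunk
            break

exact_multiple = list(chunk_lists(range(6), 3))
with_remainder = list(chunk_lists(range(7), 3))
print(exact_multiple)
print(with_remainder)
```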

Answer 5

Here's one that returns lazy chunks; use map(list, chunks(...)) if you want lists.

from itertools import islice, chain
from collections import deque

def chunks(items, n):
    items = iter(items)
    for first in items:
        chunk = chain((first,), islice(items, n-1))
        yield chunk
        deque(chunk, 0)

if __name__ == "__main__":
    for chunk in map(list, chunks(range(10), 3)):
        print(chunk)

    for i, chunk in enumerate(chunks(range(10), 3)):
        if i % 2 == 1:
            print("chunk #%d: %s" % (i, list(chunk)))
        else:
            print("skipping #%d" % i)
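
The deque(chunk, 0) line is what makes skipping safe: a deque with a maximum length of zero drains the rest of the chunk without storing anything, so the next chunk starts at the right position even if the caller never read the previous one. A small self-contained sketch of that behaviour:

```python
from collections import deque
from itertools import chain, islice

def chunks(items, n):
    items = iter(items)
    for first in items:
        chunk = chain((first,), islice(items, n - 1))
        yield chunk
        # Drain whatever the caller left unread; maxlen=0 stores nothing.
        deque(chunk, maxlen=0)

# Read only the odd-numbered chunks and ignore the others entirely.
kept = [list(c) for i, c in enumerate(chunks(range(10), 3)) if i % 2 == 1]
print(kept)
```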

Answer 6

A succinct implementation is:

from itertools import filterfalse, zip_longest  # ifilterfalse, izip_longest on Python 2

chunker = lambda iterable, n: (filterfalse(lambda x: x == (), chunk) for chunk in zip_longest(*[iter(iterable)]*n, fillvalue=()))

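This works because [iter(iterable)]*n is a list containing the same iterator n times; zipping over that takes one item from each iterator in the list, which is the same iterator each time, so that each zip element contains a group of n items.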
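
The shared-iterator trick can be seen in isolation with a plain zip. This is only a sketch to show the mechanism; note that plain zip stops as soon as one reference runs dry, so the leftover item is dropped:

```python
it = iter("ABCDEFG")
refs = [it] * 3  # three references to one iterator object, not three copies

same_object = all(r is it for r in refs)

# zip() pulls one item from each reference in turn, i.e. three consecutive
# items from the single underlying iterator per output tuple.
groups = list(zip(*refs))
print(same_object, groups)
```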

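izip_longest is needed to fully consume the underlying iterable, rather than stopping when the first exhausted iterator is reached, which would chop off any remainder of iterable. This results in the need to filter out the fill value. A slightly more robust implementation would therefore be: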

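from itertools import filterfalse, zip_longest  # ifilterfalse, izip_longest on Python 2

def chunker(iterable, n):
    class Filler(object): pass  # unique sentinel that cannot occur in the input
    return (filterfalse(lambda x: x is Filler, chunk) for chunk in zip_longest(*[iter(iterable)]*n, fillvalue=Filler))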

This guarantees that the fill value is never an item in the underlying iterable. Using the definition above:

>>> iterable = range(1,11)

>>> list(map(tuple, chunker(iterable, 3)))
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10,)]

>>> list(map(tuple, chunker(iterable, 2)))
[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]

>>> list(map(tuple, chunker(iterable, 4)))
[(1, 2, 3, 4), (5, 6, 7, 8), (9, 10)]
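
To illustrate why a dedicated sentinel matters, the sketch below feeds both variants an input that legitimately contains empty tuples. The function names are made up for the comparison, and object() stands in for the Filler class:

```python
from itertools import filterfalse, zip_longest

def chunker_tuple_fill(iterable, n):
    # Uses () as the fill value, so genuine () items are filtered out too.
    return (filterfalse(lambda x: x == (), chunk)
            for chunk in zip_longest(*[iter(iterable)] * n, fillvalue=()))

def chunker_sentinel(iterable, n):
    filler = object()  # a fresh object that cannot be identical to any input item
    return (filterfalse(lambda x: x is filler, chunk)
            for chunk in zip_longest(*[iter(iterable)] * n, fillvalue=filler))

data = [(), 1, (), 2, 3]
lossy = [tuple(c) for c in chunker_tuple_fill(data, 2)]
exact = [tuple(c) for c in chunker_sentinel(data, 2)]
print(lossy)
print(exact)
```

The first variant silently loses the two empty tuples from the input, while the sentinel variant keeps them and removes only the padding.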

This implementation almost does what you want, but it has issues:

from itertools import islice

def chunks(it, step):
  start = 0
  while True:
    end = start+step
    yield islice(it, start, end)
    start = end

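(The difference is that, because islice does not raise StopIteration or anything else on calls that go beyond the end of it, this will yield forever; there is also the slightly tricky issue that the islice results must be consumed before this generator is iterated.)

To generate the moving window functionally:

zip(count(0, step), count(step, step))  # izip on Python 2

So this becomes:

(it[start:end] for (start,end) in zip(count(0, step), count(step, step)))

But that still creates an infinite iterator. So you need takewhile (or perhaps something else might be better) to limit it.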
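
The code for that last step does not appear above. A minimal sketch of how itertools.takewhile could cut the infinite stream off at the first empty slice, assuming the input is a sequence that supports slicing and Python 3's lazy zip (the name chunks_by_slicing is made up for illustration):

```python
from itertools import count, takewhile

def chunks_by_slicing(seq, step):
    # Endless stream of (start, end) window bounds: (0, step), (step, 2*step), ...
    windows = zip(count(0, step), count(step, step))
    slices = (seq[start:end] for start, end in windows)
    # Slicing past the end returns an empty sequence, which stops takewhile.
    return takewhile(lambda s: len(s) > 0, slices)

result = list(chunks_by_slicing(list(range(1, 11)), 3))
print(result)
```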

Answer 7

"Simpler is better than complex" - a straightforward generator a few lines long can do the job. Just place it in some utilities module or so:

def grouper(iterable, n):
    iterable = iter(iterable)
    count = 0
    group = []
    while True:
        try:
            group.append(next(iterable))
            count += 1
            if count % n == 0:
                yield group
                group = []
        except StopIteration:
            # yields a trailing [] when the input length is a multiple of n
            yield group
            break

Answer 8

I forget where I found the inspiration for this. I've modified it a little to work with MSI GUIDs in the Windows Registry:

def nslice(s, n, truncate=False, reverse=False):
    """Splits s into n-sized chunks, optionally reversing the chunks."""
    assert n > 0
    while len(s) >= n:
        if reverse: yield s[:n][::-1]
        else: yield s[:n]
        s = s[n:]
    if len(s) and not truncate:
        yield s

reverse doesn't apply to your question, but it's something I use extensively with this function.

>>> [i for i in nslice([1,2,3,4,5,6,7], 3)]
[[1, 2, 3], [4, 5, 6], [7]]
>>> [i for i in nslice([1,2,3,4,5,6,7], 3, truncate=True)]
[[1, 2, 3], [4, 5, 6]]
>>> [i for i in nslice([1,2,3,4,5,6,7], 3, truncate=True, reverse=True)]
[[3, 2, 1], [6, 5, 4]]

Answer 9

Here you go.

def chunksiter(l, chunks):
    rl = []
    # step through the start index of each chunk; the last slice may be shorter
    for i in range(0, len(l), chunks):
        rl.append(l[i:i+chunks])
    return iter(rl)


def chunksiter2(l, chunks):
    # generator version: yields each slice lazily instead of building a list
    for i in range(0, len(l), chunks):
        yield l[i:i+chunks]

Example:

for l in chunksiter([1,2,3,4,5,6,7,8],3):
    print(l)

[1, 2, 3]
[4, 5, 6]
[7, 8]

for l in chunksiter2([1,2,3,4,5,6,7,8],3):
    print(l)

[1, 2, 3]
[4, 5, 6]
[7, 8]


for l in chunksiter2([1,2,3,4,5,6,7,8],5):
    print(l)

[1, 2, 3, 4, 5]
[6, 7, 8]

Iterate an iterator by chunks (of n) in Python?

9 answers:
