Reading batches from a list

Time: 2017-05-18 04:04:10

Tags: python

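I have the following situation. Say I have a variable batch_size and a list called data. I want to pull batch_size elements from data, so that when I reach the end I wrap around. In other words: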

data =[1,2,3,4,5]
batch_size = 4
-> [1,2,3,4], [5,1,2,3], [4,5,1,2], ...

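Is there a nice idiomatic way to return slices like this? The start index is always batch_size * batch modulo the length of data, but if batch_size * (batch+1) runs past the end of the list, is there a simple way to "wrap around" from the beginning? I could of course stitch two slices together in that case, but I was hoping for something really clean.

The only assumption I make is batch_size < len(data)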
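
The index arithmetic described above can be sketched by taking each index modulo the list length, which avoids stitching two slices together. This is only a minimal sketch: get_batch is a hypothetical helper name, not something from the question, and it assumes data is a non-empty list.

```python
def get_batch(data, batch_size, batch):
    """Return batch number `batch` of `batch_size` items, wrapping around data."""
    n = len(data)
    start = (batch_size * batch) % n
    # Taking each index modulo n makes reads past the end wrap to the front.
    return [data[(start + i) % n] for i in range(batch_size)]


data = [1, 2, 3, 4, 5]
batches = [get_batch(data, 4, b) for b in range(3)]
print(batches)  # [[1, 2, 3, 4], [5, 1, 2, 3], [4, 5, 1, 2]]
```

Because every index is reduced modulo len(data), this also works when batch_size is not smaller than len(data), at the cost of one index computation per element.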

2 answers:

Answer 0: (score: 2)

You can use itertools.cycle together with the grouper recipe from the itertools documentation:

import itertools

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

data = [1,2,3,4,5]
batch_size = 4
how_many_groups = 5

groups = grouper(itertools.cycle(data), batch_size)
chunks = [next(groups) for _ in range(how_many_groups)]

The resulting chunks are then:

[(1, 2, 3, 4),
 (5, 1, 2, 3),
 (4, 5, 1, 2),
 (3, 4, 5, 1),
 (2, 3, 4, 5)]

So if you really need these as lists, you will have to convert them, e.g. [list(next(groups)) for ...]
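
If lists are wanted directly, one alternative is to slice the cycled iterator with itertools.islice instead of going through grouper. The following is a minimal sketch under that assumption; cyclic_batches is a hypothetical name, and the generator never ends, so the caller decides how many batches to take.

```python
import itertools


def cyclic_batches(data, batch_size):
    """Yield lists of batch_size items from data forever, wrapping around."""
    cycled = itertools.cycle(data)
    while True:
        # islice consumes the next batch_size items from the shared iterator,
        # so each batch picks up where the previous one stopped.
        yield list(itertools.islice(cycled, batch_size))


batches = cyclic_batches([1, 2, 3, 4, 5], 4)
first_three = [next(batches) for _ in range(3)]
print(first_three)  # [[1, 2, 3, 4], [5, 1, 2, 3], [4, 5, 1, 2]]
```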

Answer 1: (score: 2)

You can also use deque from the collections module and rotate it once per batch:

from collections import deque

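def grouper(iterable, elements, rotations):
    # A batch larger than the data cannot be formed; yield nothing.
    if elements > len(iterable):
        return

    b = deque(iterable)
    for _ in range(rotations):
        yield list(b)[:elements]
        # Rotate left by the batch size so the next batch starts where this
        # one ended; rotate(1) only matches when elements == len(iterable) - 1.
        b.rotate(-elements)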


data = [1,2,3,4,5]
elements = 4
rotations = 5
final = list(grouper(data, elements, rotations))
print(final)

Output:

[[1, 2, 3, 4], [5, 1, 2, 3], [4, 5, 1, 2], [3, 4, 5, 1], [2, 3, 4, 5]]