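I am trying to use itertools.combinations and itertools.islice to create a number of batches on which computations can be run in parallel. I create the batches with the following function: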
import itertools
import math
from scipy.special import comb

def construct_batches(n, k, batch_size):
    combinations_slices = []
    # Calculate number of batches
    n_batches = math.ceil(comb(n, k, exact=True) / batch_size)
    # Construct iterator for combinations
    combinations = itertools.combinations(range(n), k)
    # Every islice below is lazy and draws from this one shared iterator
    while len(combinations_slices) < n_batches:
        combinations_slices.append(itertools.islice(combinations, batch_size))
    return combinations_slices
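After running some computations, I work out which batches and elements are relevant, so I end up with a list of batches (e.g. batches = [2,3,1]) and a list of elements (e.g. elements = [5,7,0]). To my surprise, Python behaves as follows. Suppose I want to check that the slices are correct. Then: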
combinations_slices = construct_batches(n,k,batch_size)
list(combinations_slices[0])
Out[491]:
[(0, 1, 2, 3),
(0, 1, 2, 4),
(0, 1, 2, 5),
(0, 1, 2, 6),
(0, 1, 2, 7),
(0, 1, 2, 8),
(0, 1, 2, 9),
(0, 1, 3, 4),
(0, 1, 3, 5),
(0, 1, 3, 6)]
list(combinations_slices[1])
Out[492]:
[(0, 1, 3, 7),
(0, 1, 3, 8),
(0, 1, 3, 9),
(0, 1, 4, 5),
(0, 1, 4, 6),
(0, 1, 4, 7),
(0, 1, 4, 8),
(0, 1, 4, 9),
(0, 1, 5, 6),
(0, 1, 5, 7)]
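This is all well and good, and suggests that the approach works. However, if I use a list comprehension to select the "relevant" batches, as in combinations_slices = [combinations_slices[i] for i in range(len(combinations_slices)) if i in batches], the output is (sadly):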
combinations_slices = construct_batches(n,k,batch_size)
batches = [2,3,1]
combinations_slices = [combinations_slices[i] for i in range(len(combinations_slices)) if i in batches]
list(combinations_slices[0])
Out[509]:
[(0, 1, 2, 3),
(0, 1, 2, 4),
(0, 1, 2, 5),
(0, 1, 2, 6),
(0, 1, 2, 7),
(0, 1, 2, 8),
(0, 1, 2, 9),
(0, 1, 3, 4),
(0, 1, 3, 5),
(0, 1, 3, 6)]
list(combinations_slices[1])
Out[510]:
[(0, 1, 3, 7),
(0, 1, 3, 8),
(0, 1, 3, 9),
(0, 1, 4, 5),
(0, 1, 4, 6),
(0, 1, 4, 7),
(0, 1, 4, 8),
(0, 1, 4, 9),
(0, 1, 5, 6),
(0, 1, 5, 7)]
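For what it is worth, the output above looks consistent with how islice works: each slice is lazy and only pulls items from the shared combinations iterator when it is consumed, so whichever slice is consumed first receives the first batch_size combinations, regardless of its position in the original list. A minimal sketch of that effect, using a toy range iterator as a stand-in (the names shared and slices are illustrative, not from the code above):

```python
import itertools

# Toy stand-in for the combinations iterator (illustrative only).
shared = iter(range(9))

# Three lazy slices over the SAME underlying iterator; nothing is consumed yet.
slices = [itertools.islice(shared, 3) for _ in range(3)]

# Consuming the last slice first still yields the first three items,
# because islice only pulls from `shared` when it is iterated.
last_first = list(slices[2])
then_first = list(slices[0])

print(last_first)  # [0, 1, 2]
print(then_first)  # [3, 4, 5]
```

Note also that the list comprehension walks the indices in ascending order, so the order given in batches = [2,3,1] is not preserved either way.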
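Is there a way to get the desired behavior without casting everything to a list? (In general, these lists of combinations can be very large, so I would run out of memory.) Any suggestions are appreciated.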
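One direction that might work, sketched here rather than tested at scale: walk the single combinations iterator once, batch by batch, discard the batches that are not relevant without storing them, and materialise only one relevant batch at a time. The helper name iter_selected_batches and the values n=10, k=4, batch_size=10 are assumptions for illustration (the latter inferred from the output shown above). Batches come out in ascending index order, not in the order given.

```python
import collections
import itertools

def iter_selected_batches(n, k, batch_size, wanted):
    """Yield (batch_index, batch) for each index in `wanted`, ascending.

    Hypothetical helper: only one batch is held in memory at a time.
    """
    combos = itertools.combinations(range(n), k)
    wanted = set(wanted)
    last = max(wanted, default=-1)
    for index in range(last + 1):
        chunk = itertools.islice(combos, batch_size)
        if index in wanted:
            batch = list(chunk)  # consume now, while the iterator is at this batch
            if not batch:
                return  # ran out of combinations
            yield index, batch
        else:
            # Advance past an unwanted batch without keeping its items.
            collections.deque(chunk, maxlen=0)

# Assumed example values; adjust to the real n, k and batch_size.
for index, batch in iter_selected_batches(10, 4, 10, [2, 3, 1]):
    print(index, batch[0], batch[-1])
```

The trade-off is that skipped combinations are still generated, so reaching a late batch costs time proportional to its position, but memory stays bounded by batch_size.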