Question

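I have a 256x256x256 Numpy array in which each element is a matrix. I need to do some calculations on each of these matrices, and I want to use the multiprocessing module to speed things up.

The results of these calculations must be stored in a 256x256x256 array like the original one, so that the result for the matrix at element [i,j,k] of the original array ends up in element [i,j,k] of the new array.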

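To do this, I want to build a list that could be written, in pseudocode, as [array[i,j,k], (i, j, k)] and pass it to a function to be "multiprocessed". Assuming that matrices is a list of all the matrices extracted from the original array and myfunc is the function doing the calculations, the code would look somewhat like this:

import multiprocessing
import numpy as np
from itertools import izip

def myfunc(finput):
    # Do some calculations...
    ...
    # ... and return the result and the index:
    return (result, finput[1])

# Make indices:
inds = np.rollaxis(np.indices((256, 256, 256)), 0, 4).reshape(-1, 3)

# Make function input from the matrices and the indices:
finput = izip(matrices, inds)

pool = multiprocessing.Pool()
async_results = np.asarray(pool.map_async(myfunc, finput).get(999999))

However, it seems that map_async actually creates this huge finput list first: my CPU is not doing much, but memory and swap are completely consumed within seconds, which is obviously not what I want.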

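Is there a way to pass this huge list to a multiprocessing function without having to create it explicitly first? Or do you know another way of solving this problem?

Thanks a bunch! :-)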

Answer 1

All multiprocessing.Pool.map* methods consume the iterator fully as soon as the function is called. To feed the map function one chunk of the iterator at a time, use grouper_nofill:

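import itertools

def grouper_nofill(n, iterable):
    '''list(grouper_nofill(3, 'ABCDEFG')) --> [['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]
    '''
    it = iter(iterable)
    def take():
        # Yield successive lists of up to n items; an empty list marks exhaustion.
        while 1: yield list(itertools.islice(it, n))
    # iter(callable, sentinel) stops at the first empty list.
    # Python 2 spelling; on Python 3 use take().__next__ (and zip instead of izip).
    return iter(take().next, [])

chunksize = 256
async_results = []
# Only one chunk of inputs is materialized at a time.
for finput in grouper_nofill(chunksize, itertools.izip(matrices, inds)):
    async_results.extend(pool.map_async(myfunc, finput).get())
async_results = np.array(async_results)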

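PS. pool.map_async's chunksize parameter does something different: it breaks the iterable into chunks, then gives each chunk to a worker process, which calls map(func, chunk). This can give the worker process more data to chew on if func(item) finishes too quickly, but it does not help in your situation, since the iterator still gets consumed fully immediately after the map_async call is issued.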
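The point that the iterator is drained at call time, whatever chunksize is, can be checked with a small Python 3 sketch. The names square and count_pulled_at_call are made up for this illustration. What it shows is the behaviour I would expect from CPython's Pool when the input has no __len__; treat it as an observation about the implementation, not a documented guarantee.

```python
def square(x):
    """Trivial stand-in for the real per-item work (hypothetical)."""
    return x * x


def count_pulled_at_call(pool, n, chunksize=10):
    """Return (items pulled before map_async returned, final results).

    `pool` is assumed to be a multiprocessing.Pool or any object with
    the same map_async interface.
    """
    pulled = []

    def source():
        # A generator has no __len__, so the pool cannot size it up front.
        for i in range(n):
            pulled.append(i)  # record every item the pool pulls from us
            yield i

    async_result = pool.map_async(square, source(), chunksize=chunksize)
    pulled_at_call = len(pulled)  # measured before waiting on any result
    return pulled_at_call, async_result.get()
```

Passing a multiprocessing.Pool() and comparing the first returned value with n shows how much of the input was already pulled when map_async returned, independent of the chunksize you choose.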

Answer 2

I ran into this problem as well. Instead of this:

res = p.map(func, combinations(arr, select_n))

do this:

res = p.imap(func, combinations(arr, select_n))

imap doesn't consume the iterable up front!
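Applied to the original question, a rough Python 3 sketch could look like the following. It relies on the fact that imap yields results in input order, so the (i, j, k) indices can be generated in the parent process instead of being shipped to the workers. The names trace and compute_indexed are hypothetical, and a plain dict stands in for the 256x256x256 result array. One caveat: imap's internal feeder may still read ahead of the workers, so memory use is reduced rather than strictly bounded.

```python
import itertools


def trace(matrix):
    """Stand-in for the real per-matrix calculation (hypothetical)."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def compute_indexed(pool, func, items, shape, chunksize=1):
    """Map func over items with pool.imap and key each result by its index.

    `pool` is assumed to be a multiprocessing.Pool (or an object with the
    same imap interface); `items` may be a generator and is never turned
    into a list here.
    """
    # Indices in C order: (0, 0, 0), (0, 0, 1), ..., matching a flattened array.
    indices = itertools.product(*(range(n) for n in shape))
    # imap preserves input order, so results line up with the indices.
    results = pool.imap(func, items, chunksize)
    return dict(zip(indices, results))
```

With a NumPy result array, the dict could be replaced by assigning each result to out[index] as the pairs arrive.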

Answer 3

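Pool.map_async() needs to know the length of the iterable in order to dispatch the work to multiple workers. Since izip has no __len__, it converts the iterable into a list first, causing the huge memory usage you are experiencing.

You could try to sidestep this by creating your own izip-style iterator with __len__.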
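A minimal Python 3 sketch of such a wrapper might look like this. SizedZip is a made-up name, and the caller has to supply the length because generators cannot report one. Whether this really avoids the copy depends on the Pool implementation: as far as I can tell, CPython only checks for a __len__ attribute before falling back to list(iterable), but that is an implementation detail, and the pool still allocates a result list of that length.

```python
class SizedZip:
    """zip-like iterable that also reports a length (hypothetical helper).

    The length is supplied by the caller and is not verified.
    If the wrapped iterables are one-shot iterators, this object
    can only be iterated once.
    """

    def __init__(self, length, *iterables):
        self._length = length
        self._iterables = iterables

    def __len__(self):
        return self._length

    def __iter__(self):
        # Pair items lazily, exactly like the built-in zip.
        return zip(*self._iterables)
```

In the question's code this would be used as pool.map_async(myfunc, SizedZip(256**3, matrices, inds)).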

Combining itertools and multiprocessing?
