multiprocessing.Pool.map() drops attributes of subclassed ndarray

Asked: 2017-10-18 15:03:39

Tags: python numpy subclass python-multiprocessing

When map() from multiprocessing.Pool() is used on a list of instances of a numpy.ndarray subclass, the new attributes of the subclass are dropped.

The following minimal example, based on the numpy docs subclassing example, reproduces the problem:

from multiprocessing import Pool
import numpy as np

class MyArray(np.ndarray):

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.info = getattr(obj, 'info', None)

def sum_worker(x):
    return sum(x), x.info

if __name__ == '__main__':
    arr_list = [MyArray(np.random.rand(3), info=f'foo_{i}') for i in range(10)]
    with Pool() as p:
        p.map(sum_worker, arr_list)

The attribute info is dropped:

AttributeError: 'MyArray' object has no attribute 'info'

Using the builtin map() works fine:

arr_list = [MyArray(np.random.rand(3), info=f'foo_{i}') for i in range(10)]
list(map(sum_worker, arr_list))

The purpose of the method __array_finalize__() is that the object keeps the attribute after slicing:

arr = MyArray([1,2,3], info='foo')
subarr = arr[:2]
print(subarr.info)

But with Pool.map() this method somehow does not work...

1 Answer:

Answer 0 (score: 2)

Since multiprocessing uses pickle to serialize data to and from the worker processes, this is effectively a duplicate of this question.
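
The loss can be reproduced without any worker processes, because it happens in the pickle round trip itself. The following is a minimal sketch, assuming the class from the question; the name PlainArray and the variables original and restored are illustrative only.

```python
import pickle

import numpy as np


class PlainArray(np.ndarray):
    """Same as MyArray in the question, with no pickle support (illustrative name)."""

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)


original = PlainArray([1, 2, 3], info='foo')

# Pool.map() sends each argument to a worker through this kind of
# dumps/loads round trip.
restored = pickle.loads(pickle.dumps(original))

print(type(restored).__name__)              # the subclass survives
print(np.array_equal(original, restored))   # the data survives
print(getattr(restored, 'info', None))      # the extra attribute is not carried over
```

The default ndarray pickling only stores the array's own state (shape, dtype, data), so anything kept in the instance's attributes has to be added to that state explicitly.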

Adapting the accepted solution from that question, your example becomes:

from multiprocessing import Pool
import numpy as np

class MyArray(np.ndarray):

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.info = getattr(obj, 'info', None)

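    def __reduce__(self):
        # ndarray.__reduce__() returns (reconstruct_func, args, state)
        pickled_state = super(MyArray, self).__reduce__()
        # Append the extra attribute to the state tuple so it gets pickled
        new_state = pickled_state[2] + (self.info,)
        return (pickled_state[0], pickled_state[1], new_state)

    def __setstate__(self, state):
        # The last element is the attribute appended in __reduce__()
        self.info = state[-1]
        # Hand the remaining, unmodified state back to ndarray
        super(MyArray, self).__setstate__(state[0:-1])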

def sum_worker(x):
    return sum(x), x.info

if __name__ == '__main__':
    arr_list = [MyArray(np.random.rand(3), info=f'foo_{i}') for i in range(10)]
    with Pool() as p:
        p.map(sum_worker, arr_list)
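
One way to check a class with this kind of __reduce__()/__setstate__() pair, without starting a pool, is to run the pickle round trip directly. This is only a sketch: the name PicklableArray and the variables below are hypothetical, and the class otherwise follows the code in this answer.

```python
import pickle

import numpy as np


class PicklableArray(np.ndarray):
    """ndarray subclass that carries `info` through pickling (illustrative name)."""

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)

    def __reduce__(self):
        # Extend ndarray's own state tuple with the extra attribute.
        reconstruct, args, state = super().__reduce__()
        return reconstruct, args, state + (self.info,)

    def __setstate__(self, state):
        # Peel the extra attribute off again, then restore the array itself.
        self.info = state[-1]
        super().__setstate__(state[:-1])


original = PicklableArray([1.0, 2.0, 3.0], info='foo')
restored = pickle.loads(pickle.dumps(original))

print(restored.info)       # attribute restored by __setstate__()
print(restored[:2].info)   # slicing still goes through __array_finalize__()
```

If the round trip preserves info here, the same objects should also arrive intact in the workers, since Pool.map() relies on pickle for its arguments.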

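Note that the second answer there suggests you can use pathos.multiprocessing with your original, unadapted code, since pathos uses dill instead of pickle. However, when I tested it, this did not work.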