Python multiprocessing with shared numpy array

Asked: 2017-02-28 20:58:25

Tags: python arrays multithreading numpy

Suppose I create an object A with a 2-dimensional numpy array as an attribute. Then I create 10 threads (processes) using the Process API to randomly set the rows of A.

If I write the following code, I want to know whether self.x is shared among all the processes (threads), or whether every process (thread) just has a copy.

If not shared, I will lose all the updates, right?

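import numpy as np
from multiprocessing import Process


class A:

    def __init__(self):
        self.x = np.zeros((3, 4))

    def update(self):
        # Note: Process starts separate processes, not threads.
        threads = []
        for i in range(10):
            trd = Process(target=self.set, args=(i,))
            threads.append(trd)
            trd.start()

        for trd in threads:
            trd.join()

    def set(self, i):
        # i % 3 keeps the row index inside the 3 rows; i / 3 is a float
        # in Python 3, and 9 // 3 would be out of range.
        self.x[i % 3] = np.random.rand(4)


if __name__ == "__main__":
    a = A()
    a.update()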

1 Answer:

Answer 0 (score: 1)

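No, it is not shared. You spawn multiple processes, and each process copies the file descriptors of the parent process and uses no shared objects. Each child therefore writes to its own copy of self.x, and the parent never sees those updates.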
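A minimal sketch of this, assuming a plain numpy array handed to a single child process (the names set_row and main are only for illustration): the child changes its own copy and the parent's array stays at zero.

```python
import numpy as np
from multiprocessing import Process


def set_row(x, i):
    # Runs in the child: this writes to the child's own copy of x.
    x[i] = 1.0


def main():
    x = np.zeros((3, 4))
    p = Process(target=set_row, args=(x, 0))
    p.start()
    p.join()
    # The child's write never reaches the parent's array.
    print(x.sum())  # 0.0


if __name__ == "__main__":
    main()
```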

To create a shared variable, you have to use ctypes objects.

So instead of declaring the array as -

self.x = np.zeros((3,4))

you can declare it using this Array -

from multiprocessing import Array
self.x = Array('i', [0]*10)
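Note that 'i' is the typecode for a C int. For the 3x4 float array in the question, a rough sketch would use 'd' (C double) with 12 elements and map each row onto the flat buffer. The helper fill_row and the constants ROWS and COLS are hypothetical names, not part of the question's code.

```python
from multiprocessing import Array, Process

ROWS, COLS = 3, 4


def fill_row(shared, row, value):
    # Row `row` of the 2-D layout occupies the flat slots
    # [row * COLS, (row + 1) * COLS).
    start = row * COLS
    for k in range(start, start + COLS):
        shared[k] = value


def main():
    # An integer size gives a zero-initialised array of C doubles.
    shared = Array('d', ROWS * COLS)
    procs = [Process(target=fill_row, args=(shared, r, float(r + 1)))
             for r in range(ROWS)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Unlike the plain numpy case, the parent sees the children's writes.
    print(shared[:])


if __name__ == "__main__":
    main()
```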

If you still want a numpy array as the shared array, have a look at this great answer.
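The linked answer is not reproduced here. One common approach, shown only as a sketch, is to wrap the Array's buffer with np.frombuffer so that numpy indexing writes straight into shared memory. The names as_ndarray, set_row and SHAPE are assumptions for illustration.

```python
import numpy as np
from multiprocessing import Array, Process

SHAPE = (3, 4)


def as_ndarray(shared):
    # View the shared buffer as a numpy array without copying it.
    flat = np.frombuffer(shared.get_obj(), dtype=np.float64)
    return flat.reshape(SHAPE)


def set_row(shared, i):
    x = as_ndarray(shared)
    # A fresh generator per process avoids forked children reusing
    # the parent's random state.
    x[i] = np.random.default_rng().random(SHAPE[1])


def main():
    shared = Array('d', SHAPE[0] * SHAPE[1])
    # One process per row, so no two processes write the same slots.
    procs = [Process(target=set_row, args=(shared, i))
             for i in range(SHAPE[0])]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(as_ndarray(shared))


if __name__ == "__main__":
    main()
```

In class A this would mean keeping the Array as the attribute and building the numpy view inside each method that needs it.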

One caveat here is that it may not be that easy. You also have to lock the shared array to avoid any race conditions.
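As a small sketch of the locking part, assuming several processes increment the same slot: an Array created with the default settings carries its own lock, available through get_lock(). The function add_one is a hypothetical helper.

```python
from multiprocessing import Array, Process


def add_one(shared, index, times):
    for _ in range(times):
        # += reads and then writes; hold the lock across both steps
        # so another process cannot slip in between them.
        with shared.get_lock():
            shared[index] += 1


def main():
    shared = Array('i', [0] * 10)
    procs = [Process(target=add_one, args=(shared, 0, 1000))
             for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(shared[0])  # 4000 with the lock held


if __name__ == "__main__":
    main()
```

Without the with block, each individual read and write is still synchronized, but the pair is not, so some increments can be lost.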