I am using Python multiprocessing: a Pool to launch concurrent processes, and a RawArray to share an array between them. I do not need to synchronize access to the RawArray, i.e. any process may modify the array at any time.

My test code for RawArray is below (don't mind the meaning of the program, it is only a test):
    from multiprocessing.sharedctypes import RawArray
    import multiprocessing as mp
    import time

    sieve = RawArray('i', (10 + 1) * [1])  # shared memory between processes

    def foo_pool(x):
        time.sleep(0.2)
        sieve[x] = x * x  # modify the shared memory array. seems not to work?
        return x * x

    result_list = []

    def log_result(result):
        result_list.append(result)

    def apply_async_with_callback():
        pool = mp.Pool(processes=4)
        for i in range(10):
            pool.apply_async(foo_pool, args=(i,), callback=log_result)
        pool.close()
        pool.join()
        print(result_list)
        for x in sieve:
            print(x)  # !!! sieve is [1, 1, ..., 1]

    if __name__ == '__main__':
        apply_async_with_callback()
However, the code does not work as expected; I have commented the key statements. I have been stuck on this all day. Any help or constructive suggestions would be greatly appreciated.
Answer 0 (score: 0)
time.sleep fails if you do not import time. sieve[x] = x*x is the correct way to modify the array (not sieve[x].value = x*x). The real problem: on platforms that spawn worker processes rather than fork them (e.g. Windows), your code creates a new sieve in each subprocess when the module is re-imported. You need to pass a reference to the shared array explicitly, e.g.:
    def foo_init(s):
        global sieve
        sieve = s

    def apply_async_with_callback():
        pool = mp.Pool(processes=4, initializer=foo_init, initargs=(sieve,))

    if __name__ == '__main__':
        sieve = RawArray('i', (10 + 1) * [1])
Answer 1 (score: 0)
You could use multithreading instead of multiprocessing, since threads share the main process's memory natively. If you are worried about Python's GIL mechanism, perhaps you could turn to numba.
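To illustrate the point, a minimal thread-based sketch of the same test: a plain list suffices, with no shared-memory machinery, because all threads run in one process.

```python
from concurrent.futures import ThreadPoolExecutor

sieve = (10 + 1) * [1]  # a plain list; every thread sees the same object

def foo_pool(x):
    sieve[x] = x * x  # threads share the process's memory directly
    return x * x

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(foo_pool, range(10)))

print(sieve)  # [0, 1, 4, ..., 81, 1]
```

Note that due to the GIL this buys concurrency for I/O-bound work like time.sleep, but not parallel CPU-bound computation.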
Answer 2 (score: 0)

Working version:
    from multiprocessing import Pool, RawArray
    import time

    def foo_pool(x):
        sieve[x] = x * x  # modify the shared memory array

    def foo_init(s):
        global sieve
        sieve = s

    def apply_async_with_callback(loc_size):
        with Pool(processes=4, initializer=foo_init, initargs=(sieve,)) as pool:
            pool.map(foo_pool, range(loc_size))
        for x in sieve:
            print(x)

    if __name__ == '__main__':
        size = 50
        sieve = RawArray('i', size * [1])  # shared memory between processes
        apply_async_with_callback(size)