Python multiprocessing - cannot join current thread

Time: 2012-08-07 17:24:47

Tags: python multithreading multiprocessing

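I am splitting large ctypes arrays and processing the pieces in parallel. I get the error below and believe it occurs because one segment of the array finishes processing before another. I tried using process.join() to make the first set of processes wait, but that does not work. Ideas?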

Exception RuntimeError: RuntimeError('cannot join current thread',) in <Finalize object, dead> ignored

Using:

    ....

        with closing(multiprocessing.Pool(initializer=init(array))) as p:
            del array #Since the array is now stored in a shared array destroy the array ref for memory reasons

            step = y // cores
            if step != 0:
                jobs =[]
                for i in range (0, y, step):
                    process = p.Process(target=stretch, args= (shared_arr,slice(i, i+step)),kwargs=options)
                    jobs.append(process)
                    process.start()

                for j in jobs:
                    j.join()

    del jobs
    del process

Update

        #Create a ctypes array
        array = ArrayConvert.SharedMemArray(array)
        #Create a global of options
        init_options(options) #options is a dict
        with closing(multiprocessing.Pool(initializer=init(array))) as p:
            del array #Since the array is now stored in a shared array destroy the array ref for memory reasons


            step = y // cores
            if step != 0:
                for i in range (0, y, step):
                    #Package all the options into a global dictionary

                    p.map_async(stretch,[slice(i, i+step)])

                    #p.apply_async(stretch,args=(shared_arr,slice(i, i+step)),kwargs=options)

        p.join()        

def init_options(options_):
    global kwoptions
    kwoptions = options_

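The function I pass to map_async lives in a different module, so I am struggling to get the global kwoptions across to that function. Passing globals between modules like this does not seem right (unpythonic). Is this the way to pass kwargs through map_async?

Should I rework the multiprocessing to use something different (apply or Process)?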

2 answers:

Answer 0 (score: 2)

I got this working by reworking the code and removing the pool (as per J.F. Sebastian's comment).

In pseudocode:

initialize the shared array
determine step size
create an empty list of jobs
create the process, pass it the kwargs, and append it to the job list
start the jobs
join the jobs

Here is the code, in case it helps any Googlers:

        #Initialize the ctypes array
        init(array)
        #Remove the reference to the array (to preserve memory on multiple iterations).
        del array

        step = y // cores
        jobs = []
        if step != 0:
            for i in range(0,y,step):        
                p = multiprocessing.Process(target=stretch,args= (shared_arr,slice(i, i+step)),kwargs=options)
                jobs.append(p)

            for job in jobs:
                job.start()
            for job in jobs:
                job.join()
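
For anyone who wants to run the approach end to end, here is a minimal self-contained sketch of the same pattern. It rests on assumptions that are not in the post: `stretch()` is a hypothetical stand-in that just scales a slice, the `scale` option and the `run()` helper are made up for illustration, and the shared array is a plain `multiprocessing.Array` of doubles rather than the poster's `SharedMemArray`.

```python
import multiprocessing


def stretch(shared_arr, sl, scale=1.0):
    # Hypothetical stand-in for the real stretch(): scale one slice in place.
    # Slice access on a synchronized Array acquires its lock internally.
    shared_arr[sl] = [v * scale for v in shared_arr[sl]]


def run(values, cores, **options):
    # Copy the input into shared memory ('d' = C double) so workers can modify it.
    shared_arr = multiprocessing.Array('d', list(values))
    y = len(shared_arr)
    step = y // cores
    jobs = []
    if step != 0:
        for i in range(0, y, step):
            # A final slice that runs past the end is simply clipped.
            p = multiprocessing.Process(target=stretch,
                                        args=(shared_arr, slice(i, i + step)),
                                        kwargs=options)
            jobs.append(p)
        # Start every process first, then wait for all of them.
        for job in jobs:
            job.start()
        for job in jobs:
            job.join()
    return list(shared_arr)


if __name__ == "__main__":
    print(run(range(10), cores=4, scale=2.0))
```

As in the answer's code, nothing is processed when `step` comes out as 0 (more cores than elements), so that case may need separate handling.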

Answer 1 (score: 1)

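The initializer argument of Pool() accepts a function; replace initializer=init(array) with initializer=init, initargs=(array,). Writing init(array) calls the function immediately in the parent process and passes its return value to Pool() instead of the function itself.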
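
A minimal sketch of that change, under some assumptions: `stretch()` here is a hypothetical stand-in that doubles a slice, `process()` is a made-up driver, and the shared array is a plain `multiprocessing.Array` of doubles. Only the `initializer=init, initargs=(array,)` part comes from the answer itself.

```python
import multiprocessing
from contextlib import closing


def init(shared_arr_):
    # Runs once in every worker process; keeps the array in a module-level global.
    global shared_arr
    shared_arr = shared_arr_


def stretch(sl):
    # Hypothetical stand-in for the real work: double one slice in place.
    shared_arr[sl] = [2 * v for v in shared_arr[sl]]


def process(values, cores):
    array = multiprocessing.Array('d', list(values))
    y = len(array)
    step = y // cores
    if step == 0:
        return list(array)
    slices = [slice(i, i + step) for i in range(0, y, step)]
    # Pass the function object and its arguments separately.
    pool = multiprocessing.Pool(cores, initializer=init, initargs=(array,))
    with closing(pool) as p:
        p.map(stretch, slices)  # blocks until every slice has been processed
    p.join()  # allowed here because closing() has already called p.close()
    return list(array)


if __name__ == "__main__":
    print(process(range(8), cores=2))
```

Using the blocking `map()` rather than `map_async()` keeps the sketch short; with `map_async()` the results would need to be waited on before the array is read back.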

To pass keyword arguments to a function f() used with the pool.*map* family, you can create a wrapper mp_f().
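
The answer's original code for the wrapper is not reproduced here, so the following is only a sketch of one way such a wrapper could look. The body of `f()`, its `a` and `b` keywords, and the `run()` helper are hypothetical; the idea is that each task is an (argument, kwargs) pair, since `pool.map()` hands the worker function a single object per task.

```python
import multiprocessing
from contextlib import closing


def f(interval, a=0, b=1):
    # Hypothetical worker: the real function would process the slice here.
    return [a + i * b for i in range(interval.start, interval.stop)]


def mp_f(arg_kwargs):
    # Unpack the (argument, kwargs) pair and forward the keywords to f().
    arg, kwargs = arg_kwargs
    return f(arg, **kwargs)


def run(y, step, **options):
    # Bundle each slice with the shared options dict into one task object.
    tasks = [(slice(i, min(i + step, y)), options) for i in range(0, y, step)]
    with closing(multiprocessing.Pool(2)) as p:
        results = p.map(mp_f, tasks)
    p.join()
    return results


if __name__ == "__main__":
    print(run(6, 2, a=10, b=2))
```

Because `mp_f()` and `f()` are ordinary module-level functions, this works even when `f()` lives in another module, and no global options dictionary is needed. When the options are identical for every task, `functools.partial(f, **options)` passed to `map()` is another possibility on Python 3.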