Is multiprocessing.Pool.imap broken?

Date: 2011-03-30 02:06:58

Tags: python multiprocessing

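I have tried both the multiprocessing package that ships with Python 2.6 on Ubuntu (__version__ reports 0.70a1) and the latest release from PyPI (2.6.2.1). In both cases I can't figure out how to use imap correctly: it makes the whole interpreter stop responding to ctrl-C (map works fine, though). pdb shows next() hanging in the condition variable's wait() call inside IMapIterator, so nobody is waking us up. Any hints? Thanks in advance.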

$ cat /tmp/go3.py
import multiprocessing as mp
print mp.Pool(1).map(abs, range(3))
print list(mp.Pool(1).imap(abs, range(3)))

$ python /tmp/go3.py
[0, 1, 2]
^C^C^C^C^C^\Quit

2 Answers:

Answer 0 (score: 10)

First, note that this works:

import multiprocessing as mp
import multiprocessing.util as util
pool=mp.Pool(1)
print list(pool.imap(abs, range(3)))

The difference is that pool does not get finalized when the call to pool.imap() ends.

In contrast,

print(list(mp.Pool(1).imap(abs, range(3))))

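causes the Pool instance to be finalized soon after the imap call ends. Because nothing holds a reference to the Pool, its finalizer (called self._terminate in the Pool class) is invoked. This sets in motion a sequence of commands that tears down the task handler thread, the result handler thread, the worker subprocesses, and so on.

This all happens so quickly that, at least on the majority of runs, the task sent to the task handler never completes.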
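
A minimal sketch of the usual way around this, assuming the diagnosis above is right: bind the Pool to a name so it stays referenced while the results are consumed, then shut it down explicitly. The helper name imap_with_live_pool is hypothetical and exists only for illustration.

```python
import multiprocessing as mp


def imap_with_live_pool(values):
    """Run abs over values with imap while a reference to the Pool is held."""
    pool = mp.Pool(1)  # the local name keeps the Pool alive
    try:
        # list() drains the iterator before the pool is shut down.
        return list(pool.imap(abs, values))
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker process to exit


if __name__ == "__main__":
    print(imap_with_live_pool(range(3)))
```

close() followed by join() lets queued work finish, whereas terminate() stops the workers immediately.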

Here are the relevant bits of code:

From /usr/lib/python2.6/multiprocessing/pool.py:

class Pool(object):
    def __init__(self, processes=None, initializer=None, initargs=()):
        ...
        self._terminate = Finalize(
            self, self._terminate_pool,
            args=(self._taskqueue, self._inqueue, self._outqueue, self._pool,
                  self._task_handler, self._result_handler, self._cache),
            exitpriority=15
            )

/usr/lib/python2.6/multiprocessing/util.py:

class Finalize(object):
    '''
    Class which supports object finalization using weakrefs
    '''
    def __init__(self, obj, callback, args=(), kwargs=None, exitpriority=None):
        ...
        if obj is not None:
            self._weakref = weakref.ref(obj, self)   

weakref.ref(obj, self) causes self() to be called when obj is about to be finalized.
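
The weakref callback mechanism can be tried in isolation. The sketch below is not the library's code: Owner and MiniFinalize are hypothetical stand-ins for Pool and Finalize, reduced to the one weakref.ref(obj, self) call. It assumes CPython, where dropping the last reference runs the callback immediately.

```python
import gc
import weakref


class Owner(object):
    """Hypothetical stand-in for an object such as Pool."""


class MiniFinalize(object):
    """Hypothetical, stripped-down imitation of the Finalize idea."""

    def __init__(self, obj, callback, args=()):
        self._callback = callback
        self._args = args
        # Passing self as the weakref callback means self(ref) is invoked
        # once obj is no longer referenced.
        self._weakref = weakref.ref(obj, self)

    def __call__(self, wr=None):
        # wr is the dead weak reference handed over by the interpreter.
        self._callback(*self._args)


def demo_weakref_callback():
    events = []
    owner = Owner()
    finalizer = MiniFinalize(owner, events.append, args=("terminated",))
    events.append("owner alive")
    del owner      # last strong reference disappears here
    gc.collect()   # only matters on interpreters without reference counting
    events.append("after del")
    return events


if __name__ == "__main__":
    print(demo_weakref_callback())
```

The finalizer object itself has to stay alive for the callback to fire, which is why the sketch keeps it in a local variable.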

I used the debug command util.log_to_stderr(util.SUBDEBUG) to learn the sequence of events. For example:

import multiprocessing as mp
import multiprocessing.util as util
util.log_to_stderr(util.SUBDEBUG)

print(list(mp.Pool(1).imap(abs, range(3))))

yields

[DEBUG/MainProcess] created semlock with handle 3077013504
[DEBUG/MainProcess] created semlock with handle 3077009408
[DEBUG/MainProcess] created semlock with handle 3077005312
[DEBUG/MainProcess] created semlock with handle 3077001216
[INFO/PoolWorker-1] child process calling self.run()
[SUBDEBUG/MainProcess] finalizer calling <bound method type._terminate_pool of <class 'multiprocessing.pool.Pool'>> with args (<Queue.Queue instance at 0x9d6e62c>, <multiprocessing.queues.SimpleQueue object at 0x9cf04cc>, <multiprocessing.queues.SimpleQueue object at 0x9d6e40c>, [<Process(PoolWorker-1, started daemon)>], <Thread(Thread-1, started daemon -1217967248)>, <Thread(Thread-2, started daemon -1226359952)>, {0: <multiprocessing.pool.IMapIterator object at 0x9d6eaec>}) and kwargs {}
[DEBUG/MainProcess] finalizing pool
...

Compare that with:

import multiprocessing as mp
import multiprocessing.util as util
util.log_to_stderr(util.SUBDEBUG)
pool=mp.Pool(1)
print list(pool.imap(abs, range(3)))

which yields

[DEBUG/MainProcess] created semlock with handle 3078684672
[DEBUG/MainProcess] created semlock with handle 3078680576
[DEBUG/MainProcess] created semlock with handle 3078676480
[DEBUG/MainProcess] created semlock with handle 3078672384
[INFO/PoolWorker-1] child process calling self.run()
[DEBUG/MainProcess] doing set_length()
[0, 1, 2]
[INFO/MainProcess] process shutting down
[DEBUG/MainProcess] running all "atexit" finalizers with priority >= 0
[SUBDEBUG/MainProcess] calling <Finalize object, callback=_terminate_pool, args=(<Queue.Queue instance at 0xb763e60c>, <multiprocessing.queues.SimpleQueue object at 0xb76c94ac>, <multiprocessing.queues.SimpleQueue object at 0xb763e3ec>, [<Process(PoolWorker-1, started daemon)>], <Thread(Thread-1, started daemon -1218274448)>, <Thread(Thread-2, started daemon -1226667152)>, {}), exitprority=15>
...
[DEBUG/MainProcess] finalizing pool

Answer 1 (score: 0)

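In my case, I was calling pool.imap() without expecting a return value, and could not get it to work. However, when I tried the same thing with pool.map(), it worked fine. The issue is exactly as the previous answer states: no finalizer was called, so the process was effectively dumped before it was started.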

The solution is to invoke something that forces finalization, such as the list() function. This makes it work correctly, because the results now have to be delivered to list(), and so the process is executed. In brief, it is illustrated below (this is, of course, simplified; for now, just pretend it does something useful):

from multiprocessing import Pool
from shutil import copy
from tqdm import tqdm

filedict = {
    r"C:\src\file1.txt": r"C:\trg\file1_fixed.txt",
    r"C:\src\file2.txt": r"C:\trg\file2_fixed.txt",
    r"C:\src\file3.txt": r"C:\trg\file3_fixed.txt",
    r"C:\src\file4.txt": r"C:\trg\file4_fixed.txt"
}

# target process
def copyfile(srctrg):
    copy(srctrg[0], srctrg[1])
    return True

# the guard is required on Windows, where worker processes re-import this module
if __name__ == "__main__":
    # a couple of trial processes for illustration
    with Pool(2) as pool:

        # works fine with map, but cannot utilize tqdm() since no iterator object is returned
        pool.map(copyfile, list(filedict.items()))

        # will not work, since no finalizer is called for imap
        tqdm(pool.imap(copyfile, list(filedict.items())))  # NOT WORKING

        # this works, since the finalization is forced for the process
        list(tqdm(pool.imap(copyfile, list(filedict.items()))))

In my case, the simple solution was to enclose the entire tqdm(pool.imap(...)) in list() in order to force execution.
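
One way to read this behaviour, offered as a sketch and not a definitive account: in Python 3, leaving the with block calls pool.terminate(), so the results of imap have to be consumed while still inside the block. The snippet below substitutes abs for the file copy and leaves out tqdm to stay self-contained; consume_inside_with is a hypothetical name.

```python
import multiprocessing as mp


def consume_inside_with(values):
    """Collect imap results before the with block tears the pool down."""
    results = []
    with mp.Pool(2) as pool:
        # Iterating blocks until each result arrives, so every task finishes
        # while the pool is still running. A progress wrapper such as
        # tqdm(...) could be placed around pool.imap(...) here.
        for result in pool.imap(abs, values):
            results.append(result)
    return results


if __name__ == "__main__":
    print(consume_inside_with([-3, -2, -1, 0]))
```

Wrapping the iterator in list(), as in the answer, has the same effect as the explicit loop: both pull every result out before the pool is shut down.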