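Multiprocessing: passing a class instance to pool.map

Time: 2016-05-05 16:09:56

Tags: python class multiprocessing

I swear I saw the following in an example somewhere, but now I can't find that example, and this isn't working. The class's __call__ method is never called.

Edit: code updated

pool.map does seem to start the QueueWriter instance, and execution reaches its __call__ method. However, the workers never seem to start, or at least no results are ever pulled from the queue. Is my queue set up correctly? Why don't the workers fire?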

import multiprocessing as mp
import os
import random

class QueueWriter(object):
    def __init__(self, **kwargs): 
        self.grid = kwargs.get("grid")
        self.path = kwargs.get("path")

    def __call__(self, q):
        print self.path
        log = open(self.path, "a", 1)
        log.write("QueueWriter called.\n")    
        while 1:
            res = q.get()
            if res == 'kill':
                self.log.write("QueueWriter received 'kill' message. Closing Writer.\n")
                break
            else:
                self.log.write("This is where I'd write: {0} to grid file.\n".format(res))

        log.close()
        log = None

class Worker(object):
    def __init__(self, **kwargs):
        self.queue = kwargs.get("queue")
        self.grid = kwargs.get("grid")

    def __call__(self, idx):
        res = self.workhorse(self, idx)
        self.queue.put((idx,res))
        return res

    def workhorse(self,idx):
        #in reality a fairly complex operation
        return self.grid[idx] ** self.grid[idx]


if __name__ == '__main__':
#     log = open(os.path.expanduser('~/minimal.log'), 'w',1)
    path = os.path.expanduser('~/minimal.log')

    pool = mp.Pool(mp.cpu_count())
    manager = mp.Manager()
    q = manager.Queue()

    grid = [random.random() for _ in xrange(10000)] 
    # in actuality grid is a shared resource, read by Workers and written
    # to by QueueWriter

    qWriter = QueueWriter(grid=grid, path=path)
    watcher = pool.map(qWriter, (q,),1)
    wrkr = Worker(queue=q,grid=grid)
    result = pool.map(wrkr, range(10000), 1)
    result.get()
    q.put('kill')
    pool.close()
    pool.join()    

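So the log does print the initialization message, but __call__ is never invoked. Is this one of those pickling problems I keep seeing discussed? I found answers about class member functions, but what about class instances?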
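On the pickling part of the question: pool.map has to pickle the callable to hand it to a worker process, so one quick check is whether the instance survives a pickle round trip. The sketch below is Python 3 and is only an illustration. `survives_pickling` and `WriterWithOpenFile` are hypothetical names that do not appear in the code above, and the `Worker` here is a cut-down stand-in. It assumes the classes are defined at module level.

```python
import os
import pickle


class Worker(object):
    """Cut-down stand-in for the Worker class in the question."""

    def __init__(self, grid):
        self.grid = grid

    def __call__(self, idx):
        return self.grid[idx] ** self.grid[idx]


class WriterWithOpenFile(object):
    """Hypothetical variant that keeps an open file handle on the instance."""

    def __init__(self, path):
        self.log = open(path, "a")

    def __call__(self, q):
        self.log.write("called\n")


def survives_pickling(obj):
    """Return True if obj can be pickled and unpickled again."""
    try:
        pickle.loads(pickle.dumps(obj))
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True
```

An instance of a module-level class whose attributes are plain data, such as `Worker([0.5, 0.25])`, passes this check, so a class instance with __call__ is not a problem in itself. An instance that stores an open file handle, or a lambda, does not pass, which is one reason to open the log file inside __call__ rather than in __init__.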

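1 Answer:

Answer 0: (score: 1)

With gentle and patient prodding from martineau (thanks!), I think I've worked out the problems. I haven't applied this to my original code yet, but it works in the example above, and I'll open new questions for any future implementation issues.

So, besides changing where in the code the target file (the log, in this example) is opened, I now start the QueueWriter instance as a single multiprocessing Process instead of using pool.map. As martineau pointed out, the map call blocks until qWriter.__call__() returns, which prevented the workers from ever being called.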
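To make the blocking behaviour concrete, here is a small Python 3 sketch of the difference between a blocking map call and a non-blocking apply_async call. It is not the fix used in this answer. To keep it light it swaps in `multiprocessing.pool.ThreadPool` (which exposes the same map/apply_async interface as Pool) and a plain `queue.Queue`; with real processes you would use mp.Pool, manager.Queue() and a module-level worker instead. The names `writer`, `worker` and `run_demo` are made up for the illustration.

```python
import queue
from multiprocessing.pool import ThreadPool


def writer(q):
    """Collect items until the 'kill' sentinel arrives, then return them."""
    received = []
    while True:
        item = q.get()
        if item == "kill":
            break
        received.append(item)
    return received


def run_demo(n=5):
    q = queue.Queue()

    def worker(idx):
        # Report the result on the queue and also return it to map().
        q.put(idx * idx)
        return idx * idx

    with ThreadPool(2) as pool:
        # pool.map(writer, (q,)) at this point would never return: map
        # waits for writer to finish, and writer waits for a 'kill'
        # message that is only sent further down.
        pending = pool.apply_async(writer, (q,))  # returns an AsyncResult at once
        squares = pool.map(worker, range(n))      # blocks until every call returns
        q.put("kill")                             # now the writer can finish
        received = pending.get(timeout=10)
    return squares, received
```

Note that a writer started this way occupies one pool worker for its whole lifetime, so the pool needs at least two workers. Running the writer in its own Process, as this answer does, avoids that.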

The code in the question also had a few other mistakes; they are incidental and are fixed in the full listing below:

import multiprocessing as mp
import os
import random

class QueueWriter(object):
    def __init__(self, **kwargs): 
        self.grid = kwargs.get("grid")
        self.path = kwargs.get("path")

    def __call__(self, q):
        print self.path
        log = open(self.path, "a", 1)
        log.write("QueueWriter called.\n")    
        while 1:
            res = q.get()
            if res == 'kill':
                log.write("QueueWriter received 'kill' message. Closing Writer.\n")
                break
            else:
                log.write("This is where I'd write: {0} to grid file.\n".format(res))

        log.close()
        log = None

class Worker(object):
    def __init__(self, **kwargs):
        self.queue = kwargs.get("queue")
        self.grid = kwargs.get("grid")

    def __call__(self, idx):
        res = self.workhorse(idx)
        self.queue.put((idx,res))
        return res

    def workhorse(self,idx):
        #in reality a fairly complex operation
        return self.grid[idx] ** self.grid[idx]


if __name__ == '__main__':
#     log = open(os.path.expanduser('~/minimal.log'), 'w',1)
    path = os.path.expanduser('~/minimal.log')

    pool = mp.Pool(mp.cpu_count())
    manager = mp.Manager()
    q = manager.Queue()

    grid = [random.random() for _ in xrange(10000)] 
    # in actuality grid is a shared resource, read by Workers and written
    # to by QueueWriter

    qWriter = QueueWriter(grid=grid, path=path)
    # watcher = pool.map(qWriter, (q,), 1)
    # Start the writer as a single process rather than through the pool,
    # because pool.map would block until the writer returns.
    p = mp.Process(target=qWriter, args=(q,))
    p.start()
    wrkr = Worker(queue=q,grid=grid)
    result = pool.map(wrkr, range(10000), 1)
    # result.get() is not needed: pool.map already returns the list of
    # results (get() belongs to the object returned by map_async).
    q.put('kill')
    pool.close()
    p.join()
    pool.join()
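
The listing above is Python 2 (print statement, xrange). For anyone on Python 3, a rough port of the same idea might look like the sketch below. It is an untuned illustration, not a drop-in replacement: the `run` function, its `n` and `start_method` parameters and the pool size of 2 are my own assumptions, and the workhorse step is folded into __call__ for brevity.

```python
import multiprocessing as mp
import random


class QueueWriter:
    """Drains the queue into a log file until the 'kill' sentinel arrives."""

    def __init__(self, path):
        self.path = path

    def __call__(self, q):
        # Open the file inside the writer process, not in __init__, so the
        # instance itself stays picklable.
        with open(self.path, "a") as log:
            log.write("QueueWriter called.\n")
            while True:
                res = q.get()
                if res == "kill":
                    log.write("QueueWriter received 'kill' message.\n")
                    break
                log.write("Would write {0} to grid file.\n".format(res))


class Worker:
    def __init__(self, queue, grid):
        self.queue = queue
        self.grid = grid

    def __call__(self, idx):
        res = self.grid[idx] ** self.grid[idx]
        self.queue.put((idx, res))
        return res


def run(path, n=100, start_method=None):
    """Start one writer process, map the workers over a pool, then stop the writer."""
    ctx = mp.get_context(start_method)  # None selects the platform default
    grid = [random.random() for _ in range(n)]
    with ctx.Manager() as manager:
        q = manager.Queue()
        writer = ctx.Process(target=QueueWriter(path), args=(q,))
        writer.start()
        with ctx.Pool(2) as pool:
            # Blocks until every Worker call has returned.
            results = pool.map(Worker(q, grid), range(n))
        q.put("kill")   # every result is already on the queue at this point
        writer.join()
    return results
```

As in the original, call `run(...)` from under an `if __name__ == '__main__':` guard so that child processes can import the module safely.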