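How do I use multiprocessing together with multithreading?

Asked: 2013-01-08 19:17:10

Tags: python multithreading multiprocessing cherrypy

The CherryPy server uses threads to handle requests. One particular method in my threaded server is very complex and CPU-heavy, so I have to use multiprocessing from inside the method's request thread to parallelize the execution.

I thought I could simply replace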

class Server(object):
    @cherrypy.expose
    def expensive_method(self):
        ...
        x = map(fnc, args)
        ...

def fnc(args):
    # this method doesn't need CherryPy but is expensive to compute
    ...

cherrypy.quickstart(Server())

(which works fine) with

    def expensive_method(self):
        pool = Pool()
        x = pool.map(fnc, args)
        pool.terminate()

But this does not work. Even in a simpler case, where I don't use the pool at all,

    def expensive_method(self):
        pool = Pool()
        x = map(fnc, args) # <== no pool here! same as the working example
        pool.terminate()

I get an exception:

[08/Jan/2013:20:05:33] ENGINE Caught signal SIGTERM.
2013-01-08 20:05:33,919 : INFO : _cplogging:201 : error(CP Server Thread-3) : [08/Jan/2013:20:05:33] ENGINE Caught signal SIGTERM.
[08/Jan/2013:20:05:33] ENGINE Bus STOPPING
2013-01-08 20:05:33,920 : INFO : _cplogging:201 : error(CP Server Thread-3) : [08/Jan/2013:20:05:33] ENGINE Bus STOPPING
[08/Jan/2013:20:05:38] ENGINE Error in 'stop' listener <bound method Server.stop of <cherrypy._cpserver.Server object at 0x1090c3c90>>
Traceback (most recent call last):
  File "/Volumes/work/workspace/vew/prj/lib/python2.7/site-packages/cherrypy/process/wspbus.py", line 197, in publish
    output.append(listener(*args, **kwargs))
  File "/Volumes/work/workspace/vew/prj/lib/python2.7/site-packages/cherrypy/process/servers.py", line 223, in stop
    wait_for_free_port(*self.bind_addr)
  File "/Volumes/work/workspace/vew/prj/lib/python2.7/site-packages/cherrypy/process/servers.py", line 410, in wait_for_free_port
    raise IOError("Port %r not free on %r" % (port, host))
IOError: Port 8888 not free on '127.0.0.1'

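I think this happens at the end of the request, after or during pool.terminate().

The forked worker processes don't do anything with the server or its ports. Is there a way to tell CherryPy and/or multiprocessing to ignore the "server bits"? I don't need any ports or sockets in fnc.

I need this to work on OSX + Linux, using Python 2.7.1 and CherryPy 3.2.2.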


Progress 1:

Following Sylvain's suggestion, I tried pool = Pool(initializer=cherrypy.server.unsubscribe). There are no more exceptions and everything works, but in the log I still see

[08/Jan/2013:21:16:35] ENGINE Caught signal SIGTERM.
2013-01-08 21:16:35,908 : INFO : _cplogging:201 : error(CP Server Thread-10) : [08/Jan/2013:21:16:35] ENGINE Caught signal SIGTERM.
[08/Jan/2013:21:16:35] ENGINE Bus STOPPING
2013-01-08 21:16:35,909 : INFO : _cplogging:201 : error(CP Server Thread-10) : [08/Jan/2013:21:16:35] ENGINE Bus STOPPING
[08/Jan/2013:21:16:35] ENGINE Bus STOPPED
2013-01-08 21:16:35,909 : INFO : _cplogging:201 : error(CP Server Thread-10) : [08/Jan/2013:21:16:35] ENGINE Bus STOPPED
[08/Jan/2013:21:16:35] ENGINE Bus EXITING
2013-01-08 21:16:35,909 : INFO : _cplogging:201 : error(CP Server Thread-10) : [08/Jan/2013:21:16:35] ENGINE Bus EXITING
[08/Jan/2013:21:16:35] ENGINE Bus EXITED
2013-01-08 21:16:35,910 : INFO : _cplogging:201 : error(CP Server Thread-10) : [08/Jan/2013:21:16:35] ENGINE Bus EXITED

Is this OK? Could it cause any trouble (for example, when several requests are served concurrently in different threads)?
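One possible reading of the logs above, offered as an assumption rather than something confirmed in this thread: with forked workers, each child inherits the parent's Python-level signal handlers, and pool.terminate() stops workers by sending them SIGTERM, so every worker would run the server's inherited shutdown handler. The sketch below (Python 3 syntax; restore_default_signals, sigterm_is_default and make_pool are hypothetical names) uses the same initializer hook to restore the default SIGTERM action in each worker instead. It does not touch CherryPy at all; a no-op handler stands in for whatever the server installs.

```python
import multiprocessing
import signal


def restore_default_signals():
    # Hypothetical initializer: runs once in each worker right after it starts.
    # It drops a SIGTERM handler inherited from the parent, so that
    # pool.terminate() simply ends the worker.
    signal.signal(signal.SIGTERM, signal.SIG_DFL)


def sigterm_is_default(_):
    # Executed in a worker: report whether SIGTERM has its default action.
    return signal.getsignal(signal.SIGTERM) == signal.SIG_DFL


def make_pool(processes=2):
    # "fork" is requested explicitly because handler inheritance is a fork effect.
    ctx = multiprocessing.get_context("fork")
    return ctx.Pool(processes, initializer=restore_default_signals)


if __name__ == "__main__":
    # Stand-in for the handler a server framework might install in the parent.
    signal.signal(signal.SIGTERM, lambda signum, frame: None)
    pool = make_pool()
    print(pool.map(sigterm_is_default, range(4)))
    pool.terminate()
    pool.join()
```

On Python 2.7, as used in the question, fork is the only start method on Unix, so a plain Pool(initializer=...) would take the place of get_context("fork").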


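Progress 2:

Actually, the above occasionally leaves idle processes behind :( so it does not work as it should. Strangely, these idle processes were spawned by Pool, so they should be daemonic, yet they stay alive even after the parent is killed.


Progress 3:

I moved the forking (= the Pool() call) outside the request-handling method, but after all necessary state has been initialized (so that the worker processes can see this state). No more errors or exceptions.

Bottom line: multiprocessing and multithreading don't work together.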
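The arrangement described in Progress 3 might look roughly like the sketch below. It is not the original code: plain threads stand in for CherryPy's request threads, and STATE, the body of fnc, start_pool and handle_request are hypothetical. The point it illustrates is only the ordering: initialize state, fork the pool once, then let the request threads share that pool.

```python
import multiprocessing
import threading

# Hypothetical shared state; forked workers see the value it had
# at the moment the pool was created.
STATE = {}


def fnc(x):
    # Stand-in for the expensive function; reads state inherited at fork time.
    return x * STATE["factor"]


def start_pool(processes=2):
    # Call after STATE is initialized and before any request threads start.
    return multiprocessing.get_context("fork").Pool(processes)


def handle_request(pool, args, results, key):
    # Stand-in for a request handler running in a server thread.
    results[key] = pool.map(fnc, args)


if __name__ == "__main__":
    STATE["factor"] = 10
    pool = start_pool()
    results = {}
    threads = [
        threading.Thread(target=handle_request, args=(pool, [1, 2, 3], results, n))
        for n in range(4)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(results)
    pool.close()
    pool.join()
```

Changes made to STATE after start_pool() returns are not visible to the workers, which is why the pool is created only once the state is complete.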

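1 answer:

Answer 0 (score: 3)

What type of object does "self" refer to? At what point do you initialize and start the forked processes? Perhaps more code would help diagnose the problem.

OK, this works fine: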

import multiprocessing
import os
import time

import cherrypy

def run_in_sub_proc(size):
    for i in range(size):
        # Parenthesized form prints the same text on Python 2 and Python 3
        print("%d %d" % (os.getpid(), i))
        time.sleep(1)

pool = multiprocessing.Pool(2)

class Root(object):
    @cherrypy.expose
    def index(self):
        pool.map_async(run_in_sub_proc, (3, 5))

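def stop_pool():
    # Pool.join() may only be called after close() or terminate()
    pool.close()
    pool.join()

if __name__ == '__main__':
    cherrypy.engine.subscribe('stop', stop_pool)
    cherrypy.quickstart(Root())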
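The handler above starts the tasks and returns without waiting for them. If the response needs the computed values, as with x = pool.map(fnc, args) in the question, the AsyncResult returned by map_async can be waited on. A minimal sketch, assuming a hypothetical square function and compute helper, and passing the pool in explicitly so it can be created once at startup:

```python
import multiprocessing


def square(x):
    # Stand-in for an expensive, picklable top-level function.
    return x * x


def compute(pool, args, timeout=30):
    # map_async returns immediately with an AsyncResult; get() blocks the
    # calling thread until the workers finish, and raises
    # multiprocessing.TimeoutError if the timeout expires first.
    async_result = pool.map_async(square, args)
    return async_result.get(timeout=timeout)


if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    try:
        print(compute(pool, [1, 2, 3]))
    finally:
        pool.close()
        pool.join()
```

Only the calling request thread blocks in get(); other server threads can keep submitting work to the same pool.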