CherryPy, multiprocessing, and gevent monkey patching

Time: 2012-11-15 17:32:17

Tags: python multiprocessing cherrypy gevent monkeypatching

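I'm trying to combine cherrypy + multiprocessing (to launch worker "processes") + gevent (to launch parallel I/O greenlets from within those worker processes). The easiest way to do this seems to be to monkey-patch multiprocessing, since greenlets can only run under a main application process.

However, monkey patching appears to work for some parts of multiprocessing and not for others. Here is my sample CherryPy server: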

from gevent import monkey
monkey.patch_all()

import gevent
import cherrypy
import multiprocessing

def launch_testfuncs():
    jobs = [gevent.spawn(testfunc)
            for i in range(0, 12)]

    gevent.joinall(jobs, timeout=10)

def testfunc():
    print 'testing'

class HelloWorld(object):
    def index(self):
        launch_testfuncs()

        return "Hello World!"
    index.exposed = True

    def index_proc(self):
        proc = multiprocessing.Process(target=launch_testfuncs)
        proc.start()
        proc.join()

        return "Hello World 2!"
    index_proc.exposed = True

    def index_pool(self):
        pool = multiprocessing.Pool(1)
        return "Hello World 3!"
    index_pool.exposed = True

    def index_namespace(self):
        manager = multiprocessing.Manager()
        anamespace = manager.Namespace()
        anamespace.val = 23
        return "Hello World 4!"
    index_namespace.exposed = True


cherrypy.quickstart(HelloWorld())

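The following work after monkey patching:

  • index - spawns greenlets directly inside the cherrypy class
  • index_proc - uses multiprocessing.Process to launch a new process, then spawns greenlets from within that process

The following have problems:

  • index_pool - launches a multiprocessing.Pool - hangs and never returns
  • index_namespace - initializes a multiprocessing.Manager namespace to manage shared memory across a pool/collection of workers - returns the following error message: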

    [15/Nov/2012:17:19:31] HTTP Traceback (most recent call last):
      File "/Library/Python/2.7/site-packages/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
      File "/Library/Python/2.7/site-packages/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
      File "/Library/Python/2.7/site-packages/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
      File "server.py", line 39, in index_namespace
    anamespace = manager.Namespace()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 667, in temp
    token, exp = self._create(typeid, *args, **kwds)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 565, in _create
    conn = self._Client(self._address, authkey=self._authkey)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/connection.py", line 175, in Client
    answer_challenge(c, authkey)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/connection.py", line 414, in answer_challenge
    response = connection.recv_bytes(256)        # reject large message
    IOError: [Errno 35] Resource temporarily unavailable
    

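I looked through the gevent documentation for anything on this but could not find it addressed. Is gevent's monkey patching simply incomplete? Has anyone else run into a similar problem and found a way around it?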

2 answers:

Answer 0: (score: 1)

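The problem appears to be a consequence of gevent.socket being non-blocking: any recv_bytes(X) call on the connection (as in the last frame of the traceback) raises that error if the requested data is not immediately available on the socket. Errno 35 is EAGAIN on macOS. Specifically, gevent.socket is designed never to block on a socket.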
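To make the error concrete, here is a minimal sketch that uses only the standard library, not gevent. It assumes a Unix-like platform where socket.socketpair() is available, and the helper name errno_from_empty_nonblocking_read is made up for illustration. It only shows what a read on a non-blocking socket with no pending data looks like; it does not reproduce gevent's or multiprocessing's internals.

```python
import errno
import socket


def errno_from_empty_nonblocking_read():
    """Read from a non-blocking socket that has no data and return the errno."""
    reader, writer = socket.socketpair()
    try:
        # Non-blocking mode: roughly the behavior the answer attributes
        # to gevent-managed sockets (an assumption of this sketch).
        reader.setblocking(False)
        try:
            reader.recv(256)  # nothing has been written, so this cannot succeed
        except socket.error as exc:
            return exc.errno
        return None
    finally:
        reader.close()
        writer.close()


code = errno_from_empty_nonblocking_read()
# EAGAIN is 35 on macOS (as in the traceback) and 11 on Linux.
print(code in (errno.EAGAIN, errno.EWOULDBLOCK))
```

A blocking socket would simply wait inside recv() here, which is the behavior the answer says multiprocessing expects.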

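multiprocessing runs into trouble because it uses the stdlib socket module and expects it to block. After monkey.patch_all() the socket module has been replaced, and multiprocessing.connection was not designed to handle the new asynchronous behavior.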
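For contrast, code that is written for non-blocking sockets has to wait for readiness before it reads. The sketch below uses only the standard library and assumes a Unix-like platform; recv_when_ready is a hypothetical helper, and this is neither gevent's implementation nor anything multiprocessing.connection does. It illustrates the extra step that code expecting blocking reads leaves out.

```python
import select
import socket


def recv_when_ready(sock, nbytes, timeout=1.0):
    """Wait until sock is readable, then read up to nbytes from it."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        raise socket.timeout("no data within %.1f s" % timeout)
    return sock.recv(nbytes)


reader, writer = socket.socketpair()
reader.setblocking(False)   # reads no longer wait on their own
writer.sendall(b"ping")     # make some data available
data = recv_when_ready(reader, 4)
reader.close()
writer.close()
print(data)
```

Code that calls recv() directly and assumes it will wait, as the answer describes for multiprocessing.connection, has no such readiness check and so sees EAGAIN instead.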

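You can tell monkey not to patch socket, but then anything in your application that relies on asynchronous sockets may suffer a performance penalty.

To do this, pass socket=False to patch_all: monkey.patch_all(socket=False)

This is not an ideal solution, since you lose many of the benefits that made gevent attractive in the first place.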
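As a rough sketch of where that call would go, the patching happens at the very top of the program, before the other imports, as in the question's server. The try/except guard is not part of the answer; it is an assumption added here only so the snippet still runs on a machine without gevent installed.

```python
try:
    from gevent import monkey
except ImportError:
    monkey = None  # gevent is not installed; the sketch then patches nothing

if monkey is not None:
    # Patch everything gevent normally patches except the socket module.
    monkey.patch_all(socket=False)

# Imported after patching, in the same order as the question's server.
import socket
import multiprocessing

# With socket left alone, its socket class is still the stdlib one.
socket_class_module = socket.socket.__module__
print(socket_class_module)
```

Whether this is enough to make Pool and Manager work in a given setup is not confirmed by the answer, so it is worth testing against the index_pool and index_namespace handlers.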

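Answer 1: (score: 1)

I ran into the same problem as you. My solution was to use multiprocessing.Process() to spawn a fixed number of processes, then join them all at the end to wait for completion.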

#!/usr/bin/env python
# encoding: utf-8

from gevent import monkey
monkey.patch_all()

import gevent
import multiprocessing as mp


NUM = 10


def work(i):
    jobs = [gevent.spawn(func, i)
            for i in range(0, 12)]
    gevent.joinall(jobs)
    print "{} Done {}".format(mp.current_process().name, i)


def func(x):
    print "Gevent: {}".format(x)

def main():

    processes = [mp.Process(name="Process-{}".format(i), target=work, args=(i,)) for i in xrange(NUM)]

    for process in processes:
        process.start()

    for process in processes:
        process.join()


if __name__ == '__main__':
    main()

Output

Gevent: 0
Gevent: 1
Gevent: 2
Gevent: 3
Gevent: 4
Gevent: 5
Gevent: 6
Gevent: 7
Gevent: 8
Gevent: 9
Gevent: 10
Gevent: 11
Process-0 Done 11
Gevent: 0
Gevent: 1
Gevent: 2
Gevent: 3
Gevent: 4
Gevent: 5
Gevent: 6
Gevent: 7
Gevent: 8
Gevent: 9
Gevent: 10
Gevent: 11
Process-1 Done 11
Gevent: 0
Gevent: 1
Gevent: 2
Gevent: 3
Gevent: 4
Gevent: 5
Gevent: 6
Gevent: 7
Gevent: 8
Gevent: 9
Gevent: 10
Gevent: 11
Process-2 Done 11
Gevent: 0
... ...

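This is one way to use gevent together with multiprocessing. (Each process prints "Done 11" rather than its own index because, in Python 2, the list comprehension's loop variable i leaks out and overwrites the argument i of work.)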