Tornado concurrency error when running multiple processes together with ProcessPoolExecutor

Time: 2014-10-14 20:55:04

Tags: python concurrency multiprocessing tornado concurrent.futures

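I'm trying to run multiple server processes while also using concurrent.futures.ProcessPoolExecutor to run CPU-intensive jobs. The first few requests are served without problems, but then a KeyError is raised from concurrent.futures.process and the server hangs.

Is this a bug in Tornado?

Here is the code, stripped down to its simplest form.

Server: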

# -*- coding: utf-8 -*-
"""
server runs 2 processes and does job on a ProcessPoolExecutor
"""
import tornado.web
import tornado.ioloop
import tornado.gen
import tornado.options
import tornado.httpserver

from concurrent.futures import ProcessPoolExecutor


class MainHandler(tornado.web.RequestHandler):

    executor = ProcessPoolExecutor(1)

    @tornado.gen.coroutine
    def post(self):
        num = int(self.request.body)
        result = yield self.executor.submit(pow, num, 2)
        self.finish(str(result))


application = tornado.web.Application([
    (r"/", MainHandler),
])


def main():
    tornado.options.parse_command_line()
    server = tornado.httpserver.HTTPServer(application)
    server.bind(8888)
    server.start(2)
    tornado.ioloop.IOLoop.instance().start()


if __name__ == '__main__':
    main()

Client:

# -*- coding: utf-8 -*-
"""
client
"""
from tornado.httpclient import AsyncHTTPClient
from tornado.gen import coroutine
from tornado.ioloop import IOLoop


@coroutine
def remote_compute(num):
    rsp = yield AsyncHTTPClient().fetch(
        'http://127.0.0.1:8888', method='POST', body=str(num))
    print 'result:', rsp.body


IOLoop.instance().run_sync(lambda: remote_compute(10))

Error traceback:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.7_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/local/Cellar/python/2.7.7_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/cliffxuan/.virtualenvs/executor/lib/python2.7/site-packages/concurrent/futures/process.py", line 216, in _queue_management_worker
    work_item = pending_work_items[result_item.work_id]
KeyError: 0

1 answer:

Answer 0 (score: 7)

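This comes from an interaction between Tornado and concurrent.futures when you start a Tornado server with multiple processes via server.start(2). Internally, that call uses os.fork() to create the two processes. Because you declare the Executor as a class variable, it is instantiated when the MainHandler class body itself is executed, which happens before server.start() actually runs. As a result, both processes end up sharing a single (albeit forked) ProcessPoolExecutor instance. This leads to some odd behavior: each process gets its own copy-on-write version of most of the data structures inside the Executor, but they end up actually sharing the same worker processes.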
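To make the ordering concrete, here is a minimal sketch (POSIX only, since it calls os.fork() directly) showing that a class attribute is created when the class statement runs, so a child forked later inherits a value that was built in the parent. The names Handler and creator_pid_seen_by_child are hypothetical and exist only for this illustration; the pid stands in for the Executor.

```python
import os


class Handler(object):
    # The class body runs once, when the class statement is executed
    # (normally at import time), so this records the importing process.
    created_in_pid = os.getpid()


def creator_pid_seen_by_child():
    """Fork once; return (pid stored on the class as seen by the child,
    the child's own pid)."""
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        # Child: report what it sees, then exit immediately.
        try:
            os.close(read_fd)
            message = '%d %d' % (Handler.created_in_pid, os.getpid())
            os.write(write_fd, message.encode('ascii'))
        finally:
            os._exit(0)
    # Parent: read the child's report and reap the child.
    os.close(write_fd)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(child, 0)
    seen, own = data.decode('ascii').split()
    return int(seen), int(own)
```

The first value returned is the parent's pid even though it was read inside the child: the attribute was not re-created after the fork, which is the same situation the class-level Executor ends up in.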

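ProcessPoolExecutor does not support being shared between processes this way, so you run into problems when the second process tries to use the Executor. You can work around this by creating the Executor only after the fork has happened: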

class MainHandler(tornado.web.RequestHandler):
    executor = None  # Placeholder; assigned in main() after the fork

    @tornado.gen.coroutine
    def post(self):
        num = int(self.request.body)
        result = yield self.executor.submit(pow, num, 2)
        self.finish(str(result))


application = tornado.web.Application([
    (r"/", MainHandler),
])


def main():
    tornado.options.parse_command_line()
    server = tornado.httpserver.HTTPServer(application)
    server.bind(8888)  # Same port the question's client connects to
    server.start(2)  # We fork here
    # Now each forked process can create its own Executor
    MainHandler.executor = ProcessPoolExecutor(1)
    tornado.ioloop.IOLoop.instance().start()