ProcessPoolExecutor not working with Tornado

Date: 2019-09-30 14:08:56

Tags: python asynchronous parallel-processing tornado

At this point I feel like I am in over my head. I am using the Tornado framework together with a ProcessPoolExecutor:

import tornado.ioloop
import datetime
import tornado.web
import azure.functions as func
import json
import logging
import requests
import sys
import os
import pymongo
import mongo_config
import re
import concurrent.futures
from azure_model import Pytorch_Azure

MONGO_URL = mongo_config.uri()
mongo_client = pymongo.MongoClient(MONGO_URL)
db = mongo_client['db']
prods = mongo_client['db']['products']

pta = Pytorch_Azure()

def parallel_pred(img):
    r = requests.get(img, timeout = 10)
    img_id = img.split('/')[-1].split('.')[0]
    img_name = 'tmp{}.png'.format(img_id)
    with open(img_name, 'wb') as f:
        f.write(r.content)
    prediction = pta.predict(img_name)
    os.remove(img_name)
    return prediction

class Predictionator(tornado.web.RequestHandler):
    def data_received(self, chunk):
        pass

    def get(self):
        merchant_id = self.get_argument('id', None, True)
        prod_type = self.request.uri.split('category=')[1].split('&id=')[0].replace('%20', ' ').replace('%26', '&').replace('%27', '\'')
        pred_list = []
        outputs = {}
        print(type(prod_type))
        if merchant_id and prod_type:

            counter = 0
            try:
                print(prod_type)
                for i in prods.find({'merchant': int(merchant_id), 'details.product_type':re.compile('^' + prod_type + '$', re.IGNORECASE)}):


                    prod_img = i['merchantImages'][0]
                    if prod_img not in pred_list:
                        pred_list.append(prod_img)
                        counter += 1
                        if counter == 5:
                            break
            except:
                self.write({'body': 'There was an error with the query. Please ensure you are using a correct merchant id and product type'})
            print(pred_list)

            if pred_list:

                try:   

                    executor = concurrent.futures.ProcessPoolExecutor(4)
                    for pred_out in executor.map(parallel_pred, pred_list, timeout = 15):
                        if pred_out['label'] not in outputs.keys():
                            outputs[pred_out['label']] = 1
                        else:
                            outputs[pred_out['label']] += 1


                except:
                    self.write({'body': 'There was an issue making the predictions.'})


                if outputs:
                    prediction = {}
                    prediction['label'] = max(outputs, key = outputs.get)
                    prediction['object_id'] = db.categories.find_one({'name':prediction['label']})['_id']

                    print(outputs)
                    self.write(json.dumps(prediction))
                else:
                    self.write({'statusCode': 400, 'body':'An error occurred.'})
            else:
                self.write({'statusCode': 400, 'body':'There were no results returned. Please ensure the id parameter has a valid merchant id and the category id has a valid product type'})
        else:
            self.write({'statusCode': 400, 'body':'Please pass a name on the query string or in the request body'})




def make_app():
    return tornado.web.Application([
        (r'/categorize',Predictionator),
    ])

def start_nado():
    print('starting nado')
    app = make_app()
    server = app.listen(8888)
    return server

def restart():
    python = sys.executable
    os.execl(python, python, * sys.argv)

def stop_nado():
    ioloop = tornado.ioloop.IOLoop.instance()
    ioloop.add_callback(ioloop.stop)
    ioloop.add_callback(ioloop.close)
    print('stopping nado')

def main():
    while True:
        try:
            try:
                server = start_nado()
                tornado.ioloop.IOLoop.current().add_timeout(datetime.timedelta(seconds=600), stop_nado)
                tornado.ioloop.IOLoop.current().start()
            except OSError:
                print('restarting')
                restart()
        except KeyboardInterrupt:
            tornado.ioloop.IOLoop.instance().stop()
            break


if __name__ == "__main__":
    try:
        main()
    except OSError:
        tornado.ioloop.IOLoop.instance().stop()
        main()

The main problem is in the Predictionator class. The idea is that it pulls 5 products from the database, makes a prediction on each one, and returns the category that was predicted most often. That worked fine, but it took a while, so we wanted to parallelize it across processes.

The first problem was the hangs: it would make predictions on two items and then become completely unresponsive. That is when tornado.ioloop.IOLoop.current().add_timeout(datetime.timedelta(seconds=600), stop_nado) became the "solution", effectively restarting the Tornado server every 10 minutes.

After that, OSError: [Errno 24] Too many open files started showing up. That is when the restart function became the hacky fix, essentially relaunching the whole program.

The whole thing ran fine for about two days, after which the running server stopped responding entirely. At this point I am just looking to be pointed in the right direction. I suspect Tornado is the problem, but should I be using a different framework altogether? I am new to Tornado and to parallel processing in Python. Thanks.

1 answer:

Answer 0 (score: 0)

That is because every time the if pred_list condition is met, you create a brand-new ProcessPoolExecutor with 4 worker processes, and you never shut it down. Each leaked pool keeps open pipes to its workers, which is likely where the OSError: [Errno 24] Too many open files eventually comes from.

Usually, in a Tornado program, you would create a single global executor object and reuse it:

# create a global object
executor = concurrent.futures.ProcessPoolExecutor(4)

class Predictionator(...):
    ...
    def get(self):
        ...
        # use the global `executor` object instead of creating a new one
        for pred_out in executor.map(...):

Another approach is to create the executor in a with ... as statement, so the worker processes are automatically shut down and cleaned up once they finish their tasks:

def get(self):
    ...
    with concurrent.futures.ProcessPoolExecutor(4) as executor:
        for pred_out in executor.map(...):

The first approach will give you better performance. With the second approach, you pay the overhead of creating and shutting down the worker processes on every request.
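
For reference, here is a minimal sketch of the first approach wired into a Tornado coroutine handler so the event loop is not blocked while the pool is working. This goes beyond the answer above: PredictHandler, blocking_work and the items list are made-up stand-ins, and it assumes Tornado 5+ where IOLoop.run_in_executor is available. The pool is created once at module level and shut down once when the process exits, instead of once per request:

import atexit
import concurrent.futures
import tornado.gen
import tornado.ioloop
import tornado.web

# One pool for the whole process; worker processes are reused across requests.
executor = concurrent.futures.ProcessPoolExecutor(4)
# Shut the pool down once at interpreter exit so the workers and their
# pipes are released (avoids leaking file descriptors over time).
atexit.register(executor.shutdown)

def blocking_work(item):
    # Stand-in for the CPU-heavy call (parallel_pred in the question).
    return {'label': str(item)}

class PredictHandler(tornado.web.RequestHandler):
    async def get(self):
        items = ['a', 'b', 'c']  # stand-in for pred_list
        loop = tornado.ioloop.IOLoop.current()
        # Submit every item to the shared pool up front, then await them all;
        # the IOLoop stays free to serve other requests while the workers run.
        futures = [loop.run_in_executor(executor, blocking_work, i) for i in items]
        results = await tornado.gen.multi(futures)
        self.write({'results': results})

if __name__ == '__main__':
    app = tornado.web.Application([(r'/predict', PredictHandler)])
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()

Submitting all items before awaiting lets them run in the pool in parallel, and registering executor.shutdown with atexit means the worker processes (and their pipes) are cleaned up when the server stops rather than accumulating across restarts.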