I'm trying to figure out how to build a solution that makes effective use of two pools of processes (the Flask instances and the workers), using the code below:
import argparse
import uuid
import sqlite3
from flask import Flask, request, jsonify, make_response
from multiprocessing import Queue, Process, Lock, Manager
from ServerUtils import StandaloneApplication
from queue import Empty
app = Flask(__name__)
authorization_tokens = ["dummy_token"]
jobs = []
def worker(request_queue, type, manager_dict):
    # Each worker process owns its own SQLite connection and pulls
    # (text, request_id) tasks from its queue in an endless loop.
    worker_instance = Worker()
    conn = sqlite3.connect("main.db")
    c = conn.cursor()
    while True:
        if not request_queue.empty():
            try:
                text, request_id = request_queue.get_nowait()
            except Empty:
                continue
            try:
                result = worker_instance.work(text)
                c.execute("INSERT INTO parsed VALUES(?, ?)", (request_id, result))
                conn.commit()
            except:
                c.execute("INSERT INTO parsed VALUES(?, ?)", (request_id, "ERROR"))
                conn.commit()
@app.route("/", methods=["POST"])
def parse():
    conn = sqlite3.connect("main.db")
    c = conn.cursor()
    token = request.headers["Authorization"] if "Authorization" in request.headers else ""
    if token in authorization_tokens:
        body = request.get_json()
        request_id = str(uuid.uuid4())
        # Hand the text off to the worker pool, then poll the database
        # until a worker has written the result for this request_id.
        queues[body["Type"]].put((body["Text"], request_id))
        answer = None
        while True:
            try:
                answer = c.execute("SELECT * FROM parsed WHERE id_ LIKE ?", (request_id,)).fetchall()
                if not answer:
                    continue
                else:
                    break
            except:
                import pdb; pdb.set_trace()
                break
        response = {"body": {"Parsed text": answer[0]}}
        return make_response(jsonify(response), 200)
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("--workers", type=int)
    parser.add_argument("--types", nargs="+", type=str)
    args = parser.parse_args()

    manager_dict = Manager().dict()  # shared state passed to every worker
    queues = {type: Queue() for type in args.types}
    for num, type in zip([i for i in range(args.workers)], args.types):
        jobs.append(Process(target=worker,
                            args=(queues[type],
                                  type,
                                  manager_dict)))
    [job.start() for job in jobs]

    # Start the web server with args.workers + 2 server processes.
    options = {
        'bind': '%s:%s' % ('127.0.0.1', '5000'),
        'workers': args.workers + 2,
        'timeout': 300,
    }
    StandaloneApplication(app, options).run()
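Worker and StandaloneApplication live in my ServerUtils module, which I haven't included. Worker just wraps the actual parsing logic, and StandaloneApplication is essentially the standard Gunicorn "custom application" wrapper, so it looks roughly like this (a sketch, not the exact code):

# Rough sketch of what StandaloneApplication does (the usual Gunicorn
# custom-application pattern); the real class lives in ServerUtils.
import gunicorn.app.base

class StandaloneApplication(gunicorn.app.base.BaseApplication):
    def __init__(self, app, options=None):
        self.options = options or {}
        self.application = app
        super().__init__()

    def load_config(self):
        # Copy recognised options (bind, workers, timeout, ...) into Gunicorn's config.
        for key, value in self.options.items():
            if key in self.cfg.settings and value is not None:
                self.cfg.set(key.lower(), value)

    def load(self):
        return self.application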
I'm running the code with this command:
python main.py --workers 4
The main idea is that I have 4 processes handling separate requests to the server at the same time, but they all share one pool of workers. So, basically:
The request-handling process puts the text to be parsed into a multiprocessing.Queue object:
queues[body["Type"]].put((body["Text"], request_id))
One of the workers picks the task up from the queue: text, request_id = request_queue.get_nowait()
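Stripped of Flask and SQLite, the handoff I have in mind is essentially this (a minimal sketch; parse_text is just a placeholder for Worker().work):

# Minimal producer/consumer sketch of the handoff described above.
# parse_text is a stand-in for the real Worker().work method.
from multiprocessing import Process, Queue

def parse_text(text):
    return text.upper()  # placeholder for the actual parsing

def worker_loop(tasks, results):
    while True:
        text, request_id = tasks.get()             # block until a task arrives
        results.put((request_id, parse_text(text)))

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    Process(target=worker_loop, args=(tasks, results), daemon=True).start()
    tasks.put(("some text", "id-1"))               # what the Flask handler does
    print(results.get())                           # -> ('id-1', 'SOME TEXT')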
The problem is that with this setup multiprocessing still ends up slower than running with just a single worker:
python main.py --workers 1
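To make "slower" concrete: I mean total wall-clock time for a batch of concurrent requests. A client script along these lines is what I have in mind (hypothetical; the URL, token and "Type" value just mirror the server code above, and it assumes the server was started with --types default):

# Hypothetical client used to compare --workers 4 vs --workers 1.
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def send(i):
    req = urllib.request.Request(
        "http://127.0.0.1:5000/",
        data=json.dumps({"Type": "default", "Text": f"text {i}"}).encode(),
        headers={"Authorization": "dummy_token", "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

start = time.time()
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(send, range(50)))
print(f"50 requests took {time.time() - start:.1f}s")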
Why?