Question

I have a Flask application that is served with waitress on Windows:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/index/predict", methods=['POST'])
def prediction():
    data = request.get_json()
    # ... Loading Model and other files
    prediction = model.predict_proba(data)
    # Good and Bad stand for the predicted values; they are not defined in this snippet
    results = {'0': Good, '1': Bad}
    return jsonify(results)

from waitress import serve
serve(app, host='0.0.0.0', threads=10, port=8080)

I tested it with Postman and get results within 70 ms to 200 ms.

To check the threading, I have a Python script:

import json
import time
from multiprocessing import Pool

import requests

endpoint = 'http://localhost:8080/index/predict'
payload = loadContent('./payload.json')  # loadContent: helper not shown in this snippet
def callAPI(x):
    t1 = time.perf_counter()
    r = requests.post(endpoint, json=json.loads(payload))
    print("[%d] elapsed: %f" % (x, time.perf_counter() - t1))
    print("[%d] response: %s" % (x, r.json()))
    return r.json()
        
if __name__ == '__main__':
    th1 = time.perf_counter()
    # 15 worker processes are created, but only 5 calls are submitted
    with Pool(processes=15) as p:
        p.map(callAPI, range(5))
    print('overall elapsed: %f' % (time.perf_counter() - th1))
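Since the goal is to exercise waitress's threads, the client can also be driven from threads instead of processes. On Windows, `multiprocessing.Pool` starts every worker as a fresh Python process, which adds start-up time to the overall figure. What follows is a minimal sketch, assuming a thread pool is acceptable for the test; `timed` and `run_concurrently` are hypothetical helper names, not part of the original script:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(fn, *args):
    """Run fn(*args) and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn(*args)
    return time.perf_counter() - start, result


def run_concurrently(fn, n_calls, max_workers=15):
    """Call fn(i) for i in range(n_calls) on a thread pool.

    Returns (overall_elapsed, [(elapsed_i, result_i), ...]) in call order.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the inputs
        per_call = list(pool.map(lambda i: timed(fn, i), range(n_calls)))
    return time.perf_counter() - start, per_call
```

With the script above, `run_concurrently(callAPI, 5)` would return the overall time plus one `(elapsed, result)` pair per call, without any process start-up cost mixed in.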

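I am sure a single prediction takes at most 200 ms, but when run in parallel each prediction takes up to 2 seconds. The total time for the 5 calls is 3.4 seconds, which is fine, but why does each call take 2 seconds?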

When I make just a single call, it takes 2 seconds and the total time is 3.4 seconds.

I don't understand where the extra time comes from. Is this a Windows problem, or is there a way to fix it? Thanks for your help.
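One thing that may be worth ruling out, stated here as an assumption rather than a diagnosis: on Windows, `localhost` can resolve to the IPv6 address `::1` first, while a server bound to `0.0.0.0` listens on IPv4 only, so each new connection may wait for the IPv6 attempt to fail before falling back to `127.0.0.1`. The sketch below only measures TCP connect time for both names; `connect_time` and `compare_hosts` are hypothetical helpers.

```python
import socket
import time


def connect_time(host, port, timeout=5.0):
    """Return the seconds taken to open (and close) a TCP connection to host:port."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.perf_counter() - start


def compare_hosts(port, hosts=("localhost", "127.0.0.1")):
    """Measure connect time per host name; a failed connect is recorded as None."""
    timings = {}
    for host in hosts:
        try:
            timings[host] = connect_time(host, port)
        except OSError:
            timings[host] = None
    return timings


if __name__ == "__main__":
    # 8080 is the port used by serve(...) in the question
    for host, elapsed in compare_hosts(8080).items():
        print(host, "failed" if elapsed is None else "%.3f s" % elapsed)
```

If `localhost` turns out to be much slower than `127.0.0.1`, changing `endpoint` to `http://127.0.0.1:8080/index/predict` would be a cheap experiment.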

I tried using a Session in requests, but the timing is the same, with no improvement:

    sess = requests.Session()
    r = sess.post(endpoint, json=json.loads(payload), headers={'Connection': 'close'})
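Note that a `Connection: close` header asks for the connection to be closed after each response, so the session has nothing to reuse. To see whether connection setup is part of the per-call cost, repeated requests can be timed over one persistent connection. This is a rough sketch that uses the standard library's `http.client` instead of `requests` so it runs without extra dependencies; `post_many` is a hypothetical helper name.

```python
import http.client
import json
import time


def post_many(host, port, path, payload, n_calls):
    """POST payload n_calls times over one connection; return elapsed seconds per call."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    body = json.dumps(payload)
    headers = {"Content-Type": "application/json"}
    timings = []
    try:
        for _ in range(n_calls):
            start = time.perf_counter()
            conn.request("POST", path, body=body, headers=headers)
            resp = conn.getresponse()
            resp.read()  # drain the body so the connection can be reused
            timings.append(time.perf_counter() - start)
    finally:
        conn.close()
    return timings
```

If the first entry in the returned list is much larger than the rest, the extra time is likely spent establishing the connection rather than in the prediction itself.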

Flask on waitress in Windows

0 answers: