Parallel computation with joblib in Flask

Time: 2018-05-24 09:55:25

Tags: python flask parallel-processing joblib

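I have a Python function that needs to be called repeatedly with different argument values, and I want to run those calls in parallel on multiple CPUs. I have done this successfully with the joblib module. I now want to serve my code as a web application using Flask on an AWS EC2 instance with multiple CPUs. Here is the toy example I tried: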

from flask import Flask
from joblib import Parallel, delayed
from time import sleep

def myfunc(x):
    sleep(5)
    return x

application = Flask(__name__)

@application.route('/', methods = ['GET'])
def getresult():
    out = Parallel(n_jobs=-1, verbose=10)(delayed(myfunc)(i) for i in range(5))
    return str(sum(out))

if __name__ == "__main__":
    application.debug = True
    application.run()

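The problem is that this code does not run in parallel across multiple CPUs. I get the following warning and output (the elapsed times confirm that it is not running in parallel):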

    /Library/anaconda/lib/python3.6/site-packages/joblib/parallel.py:547:
    UserWarning: Multiprocessing-backed parallel loops cannot be nested below 
    threads, setting n_jobs=1
      **self._backend_args)
    [Parallel(n_jobs=-1)]: Done   1 out of   1 | elapsed:    5.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   2 out of   2 | elapsed:   10.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   3 out of   3 | elapsed:   15.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   4 out of   4 | elapsed:   20.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   5 out of   5 | elapsed:   25.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   5 out of   5 | elapsed:   25.0s finished

Any suggestions?

1 answer:

Answer 0: (score: 0)

Look at the UserWarning you are getting:

UserWarning: Multiprocessing-backed parallel loops cannot be nested below 
threads, setting n_jobs=1

Maybe this related question will help:

Multiprocessing backed parallel loops cannot be nested below threads, setting n_jobs=1

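Flask may be spinning up its own threads under the hood, so your getresult() may not be running in the MainThread. As the warning says, joblib then sets n_jobs=1 and the five calls run one after another.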
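
To check that guess without Flask or joblib, here is a minimal sketch using only the standard library. It is not the fix from the linked question, only an illustration: the helper in_main_thread(), the shortened 0.2-second sleep, and the plain threading.Thread standing in for a server worker thread are all assumptions made for this example. It also swaps joblib for concurrent.futures.ThreadPoolExecutor, which can be used from a non-main thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def myfunc(x):
    sleep(0.2)  # shortened stand-in for the 5-second sleep in the question
    return x


def in_main_thread():
    """Hypothetical helper: True if the caller runs in the interpreter's main thread."""
    return threading.current_thread() is threading.main_thread()


def getresult():
    """Stand-in for the route handler: report the thread, run myfunc concurrently."""
    # A thread pool (not joblib) runs the five calls at the same time.
    with ThreadPoolExecutor(max_workers=5) as pool:
        out = list(pool.map(myfunc, range(5)))
    return in_main_thread(), sum(out)


# Simulate a server that hands each request to a worker thread (an assumption
# about how the handler is being called, not something read from Flask itself).
result = {}
worker = threading.Thread(target=lambda: result.update(value=getresult()))
start = perf_counter()
worker.start()
worker.join()
elapsed = perf_counter() - start

print(result["value"])  # (False, 10): the handler did not run in the main thread
print(f"elapsed: {elapsed:.2f}s")
```

Threads are enough here only because myfunc just sleeps. For CPU-bound Python code the global interpreter lock would limit the benefit, and a process-based approach would be needed instead.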