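This is my first time using joblib. I am working in a Jupyter notebook on Windows, on a 16-core machine. My code runs much slower with joblib's parallel processing than it does as a single process. I can see that joblib creates processes for the job, but all of them use less than 2% CPU except one, which uses 6%. In total, only up to about 11% of the CPU is used. I'm not sure what I'm doing wrong. Here is my code: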
prepare_function.py
def intersect_fire(index, grid_cell, index2, fire):
    if grid_cell['geometry'].intersects(fire['geometry']):
        grid_index = index
        UUID = fire.UUID
        data = {'GridIndex': grid_index, 'geometry': grid_cell['geometry'].intersection(fire['geometry']), 'UUID': UUID}
        return data
Parallel processing code:
%%time
from prepare_function import intersect_fire
from joblib import Parallel, delayed
if __name__ == '__main__':
    data = Parallel(n_jobs=-1)(delayed(intersect_fire)(index, grid_cell, index2, fire) for index, grid_cell in poly1.iterrows() for index2, fire in poly2.iterrows())
    data = [x for x in data if x is not None]
I also tested with some sample code I found online, and it used only 7% of the CPU in total.
test_function.py
def multiple(a, b):
    return a * b
joblib parallel processing code:
%%time
from test_function import multiple
from joblib import Parallel, delayed
if __name__ == '__main__':
    Parallel(n_jobs=-1)(delayed(multiple)(a=i, b=j) for i in range(1, 6000) for j in range(11, 1600))
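
For a sense of scale, the nested generator in this test creates one joblib task per `(i, j)` pair, and each task is a single multiplication. Below is a minimal sketch that counts those tasks and shows one possible way to group them into one task per `i`, so that each dispatched task does more work. The helper names `count_tasks` and `multiply_row` are hypothetical (they are not part of joblib or of my code above), and whether this grouping actually improves CPU utilization on my machine is an assumption, not something I have measured.

```python
def count_tasks(i_range, j_range):
    """Number of (i, j) pairs the nested generator would submit as tasks."""
    return len(i_range) * len(j_range)


def multiply_row(i, j_range):
    """Hypothetical chunked task: compute all products for one value of i."""
    return [i * j for j in j_range]


# Same ranges as in the test above.
i_range = range(1, 6000)
j_range = range(11, 1600)

total_tasks = count_tasks(i_range, j_range)  # one task per (i, j) pair
chunked_tasks = len(i_range)                 # one task per i if grouped by row

print(total_tasks, chunked_tasks)
```

If `multiply_row` were saved in an importable module such as `test_function.py`, the grouped version could presumably be dispatched as `Parallel(n_jobs=-1)(delayed(multiply_row)(i, j_range) for i in i_range)`.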