After using the dask multiprocessing scheduler for a long stretch, I've noticed that the worker processes seem to consume far more memory than they did right after instantiation.
Are there particular pitfalls to watch out for to avoid memory leaks when using the multiprocessing scheduler? And how can I forcibly kill and restart the multiprocessing scheduler's dask workers from Python?
Edit: In my use case, the multiprocessing scheduler is roughly 2x faster than the distributed scheduler, so I'm specifically interested in how to kill this scheduler's workers.
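For context, here is a minimal sketch of running a computation on the multiprocessing scheduler. The `scheduler='processes'` keyword is the newer spelling; older dask versions used `get=dask.multiprocessing.get` instead:

```python
import dask.bag

# Build a small bag split across 8 partitions and sum it
# using the multiprocessing (process pool) scheduler.
data = dask.bag.range(10000, npartitions=8)
result = data.sum().compute(scheduler='processes')
print(result)  # 49995000
```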
Answer (score: 1)
For tasks that consume a lot of memory, I prefer to use the distributed scheduler on localhost. Setting it up is simple:
```
$ dask-scheduler
distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO - Scheduler at:        1.2.3.4:8786
distributed.scheduler - INFO -      http at:        1.2.3.4:9786
distributed.bokeh.application - INFO - Web UI: http://1.2.3.4:8787/status/
distributed.scheduler - INFO - -----------------------------------------------
distributed.core - INFO - Connection from 1.2.3.4:56240 to Scheduler
distributed.core - INFO - Connection from 1.2.3.4:56241 to Scheduler
distributed.core - INFO - Connection from 1.2.3.4:56242 to Scheduler
```
```
$ dask-worker --nprocs 8 --nthreads 1 --memory-limit .8 1.2.3.4:8786
distributed.nanny - INFO - Start Nanny at:          127.0.0.1:61760
distributed.nanny - INFO - Start Nanny at:          127.0.0.1:61761
distributed.nanny - INFO - Start Nanny at:          127.0.0.1:61762
distributed.nanny - INFO - Start Nanny at:          127.0.0.1:61763
distributed.worker - INFO - Start worker at:        127.0.0.1:61765
distributed.worker - INFO -        nanny at:        127.0.0.1:61760
distributed.worker - INFO -         http at:        127.0.0.1:61764
distributed.worker - INFO - Waiting to connect to:  127.0.0.1:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.nanny - INFO - Start Nanny at:          127.0.0.1:61767
distributed.worker - INFO - Memory: 1.72 GB
distributed.worker - INFO - Local Directory: /var/folders/55/nbg15c6j4k3cg06tjfhqypd40000gn/T/nanny-11ygswb9
...
```
Then submit jobs through the distributed.Client class:

```
In [1]: from distributed import Client

In [2]: client = Client('1.2.3.4:8786')

In [3]: client
<Client: scheduler="127.0.0.1:61829" processes=8 cores=8>

In [4]: from distributed.diagnostics import progress

In [5]: import dask.bag

In [6]: data = dask.bag.range(10000, 8)

In [7]: data
dask.bag

In [8]: future = client.compute(data.sum())

In [9]: progress(future)
[########################################] | 100% Completed |  0.0s

In [10]: future.result()
49995000
```
I find this setup more reliable than the default scheduler. I prefer to submit tasks explicitly and work with the resulting futures, using the progress widget, which is really nice in a notebook. You can also keep doing other things while waiting for the results.
If you get errors due to memory problems, you can restart the workers or the scheduler (start over from scratch), use smaller chunks of data, and try again.
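With the distributed scheduler, the restart can also be done from Python via `Client.restart()`, which kills every worker process and spawns fresh ones, discarding all worker memory. A minimal sketch against a local test cluster (the worker counts here are illustrative; in practice you would connect to your existing scheduler address instead):

```python
from distributed import Client, LocalCluster

# Spin up a small local cluster; with a running scheduler you
# would instead do Client('1.2.3.4:8786').
cluster = LocalCluster(n_workers=2, threads_per_worker=1)
client = Client(cluster)

# Kill all worker processes and start fresh ones; any leaked
# memory held by the old workers is released.
client.restart()

client.close()
cluster.close()
```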
Update: you can do this to kill the workers started by the multiprocessing scheduler:
```python
from dask.context import _globals

pool = _globals.pop('pool')  # remove the pool from globals so dask creates a new one
pool.close()
pool.terminate()
pool.join()
```
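The `pool` object dask keeps here is a standard `multiprocessing.Pool`, so the same close/terminate/join teardown applies to any such pool. A self-contained stdlib sketch (the `square` helper is purely illustrative):

```python
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=2)
    print(pool.map(square, range(5)))  # [0, 1, 4, 9, 16]

    # Tear the pool down: stop accepting new work, kill the worker
    # processes, then reap them so no zombies are left behind.
    pool.close()
    pool.terminate()
    pool.join()

    # A replacement pool starts with fresh worker processes, so any
    # memory the old workers had accumulated is released.
    pool = Pool(processes=2)
    print(pool.map(square, range(5)))  # [0, 1, 4, 9, 16]
    pool.close()
    pool.join()
```

Recreating the pool after `join()` mirrors what dask does when it finds no `'pool'` entry in its globals: the next computation gets brand-new worker processes with a clean memory footprint.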