Progress reporting for dask's set_index

Time: 2017-10-25 06:06:44

Tags: dask dask-distributed

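I am trying to wrap a progress indicator around my entire script. However, set_index(..., compute=False) still runs tasks on the scheduler, which I can observe in the web interface.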

How can I report progress for the set_index step?

import dask.dataframe as dd
from dask.distributed import Client, progress

if __name__ == '__main__':

  with Client() as client:

    df = dd.read_csv('big.csv')

    # I can see on the web interface that something is happening.
    # This blocks 20-30s on this particular CSV.
    df = df.set_index('id', compute=False)

    # Progress reporting works from here
    out = client.compute(
      df
    )
    progress(out)

    # out.result()
    # ...

0 answers:

No answers yet