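I have a notebook with pandas and dask operations.
When I haven't started a client, everything works as expected. But once I start the dask.distributed client, I get warnings in cells that run pandas operations, e.g. pd.read_parquet('my_file').
Since I started the workers, I get a number of Nanny lines.
Example warnings: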
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.26s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.37s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Scheduler for 1.37s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.36s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
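I want to know what causes them, and how to make them stop.
Answer 0: (score: 1)
This warning means that the Dask worker processes have been unresponsive for a long time. This is bad because the workers can't deliver data to other workers, talk to the scheduler, and so on. It is not normal even while computations are running, because those computations run in a separate thread.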
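To illustrate the point about threads, here is a minimal sketch using only the standard library. It is not Dask's actual implementation; the helper names (max_tick_delay, block_on_loop, block_in_thread) are made up for this example. A ticker coroutine measures how late each tick arrives while some blocking work runs either directly on the event loop or in a separate thread. time.sleep stands in for a computation that releases the GIL; a function that holds the GIL could still starve the loop even from another thread.

```python
import asyncio
import time


async def max_tick_delay(blocker, interval=0.02):
    """Run `blocker` while a ticker records the worst lateness of a tick."""
    worst = 0.0
    done = False

    async def ticker():
        nonlocal worst
        last = time.monotonic()
        while not done:
            await asyncio.sleep(interval)
            now = time.monotonic()
            # How much later than `interval` did this tick actually fire?
            worst = max(worst, now - last - interval)
            last = now

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)  # give the ticker a chance to start
    await blocker()
    done = True
    await task
    return worst


async def block_on_loop():
    # Blocks the event loop thread itself: no ticks can fire meanwhile.
    time.sleep(0.3)


async def block_in_thread():
    # Same work, but handed to a worker thread; the loop keeps ticking.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, time.sleep, 0.3)


on_loop = asyncio.run(max_tick_delay(block_on_loop))
in_thread = asyncio.run(max_tick_delay(block_in_thread))
print(f"worst tick delay, blocking on the loop:  {on_loop:.3f}s")
print(f"worst tick delay, blocking in a thread: {in_thread:.3f}s")
```

When the work runs on the loop, the worst delay is roughly the length of the blocking call; when it runs in a thread, the delay should stay small. The warning in the question reports the same kind of measurement.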
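There are two main causes of this problem:
There was also a bug fixed around distributed.__version__ == '1.21.3', so you may want to upgrade. You can also silence the warning by increasing the maximum allowed tick time in your ~/.dask/config.yaml file: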
tick-maximum-delay: 10 s
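
For the upgrade suggestion, a small sketch like the following can tell you whether the installed distributed predates the version mentioned above. It uses only the standard library (importlib.metadata). The helpers version_tuple and predates_fix are hypothetical names for this example, and the comparison is a simplification that looks only at the leading numeric components of the version string.

```python
import re
from importlib import metadata

FIXED_AROUND = "1.21.3"  # version named in the answer; the fix landed "around" it


def version_tuple(version):
    """Turn '1.21.3' into (1, 21, 3); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def predates_fix(installed, fixed=FIXED_AROUND):
    """True if `installed` is older than `fixed` (simplified comparison)."""
    return version_tuple(installed) < version_tuple(fixed)


try:
    installed = metadata.version("distributed")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("distributed is not installed in this environment")
elif predates_fix(installed):
    print(f"distributed {installed} predates {FIXED_AROUND}; consider upgrading")
else:
    print(f"distributed {installed} is at or past {FIXED_AROUND}")
```

If the version is already newer and the warnings persist, the bug mentioned above is probably not the cause.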