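I'm having trouble using LocalCluster from distributed, and I tried to produce a minimal example. I couldn't even get that far (I'm using Python 3.6, distributed 1.23.3, tornado 5.1.1). If I create a Python file test.py with the following contents: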
from distributed import LocalCluster
cluster = LocalCluster()
and run the file with the following command:
python test.py
I get a very long error message:
...
distributed.nanny - WARNING - Worker process 841 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker
tornado.application - ERROR - Multiple exceptions in yield list
Traceback (most recent call last):
File "/home/user/venv/lib/python3.6/site-packages/tornado/gen.py", line 883, in callback
result_list.append(f.result())
File "/home/user/venv/lib/python3.6/site-packages/tornado/gen.py", line 1147, in run
yielded = self.gen.send(value)
File "/home/user/venv/lib/python3.6/site-packages/distributed/deploy/local.py", line 217, in _start_worker
raise gen.TimeoutError("Worker failed to start")
tornado.util.TimeoutError: Worker failed to start
Traceback (most recent call last):
File "test.py", line 7, in <module>
cluster = LocalCluster(n_workers=2)
File "/home/user/venv/lib/python3.6/site-packages/distributed/deploy/local.py", line 141, in __init__
self.start(ip=ip, n_workers=n_workers)
File "/home/user/venv/lib/python3.6/site-packages/distributed/deploy/local.py", line 171, in start
self.sync(self._start, **kwargs)
File "/home/user/venv/lib/python3.6/site-packages/distributed/deploy/local.py", line 164, in sync
return sync(self.loop, func, *args, **kwargs)
File "/home/user/venv/lib/python3.6/site-packages/distributed/utils.py", line 277, in sync
six.reraise(*error[0])
File "/home/user/venv/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/home/user/venv/lib/python3.6/site-packages/distributed/utils.py", line 262, in f
result[0] = yield future
File "/home/user/venv/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/venv/lib/python3.6/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "/home/user/venv/lib/python3.6/site-packages/distributed/deploy/local.py", line 191, in _start
yield [self._start_worker(**self.worker_kwargs) for i in range(n_workers)]
File "/home/user/venv/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/venv/lib/python3.6/site-packages/tornado/gen.py", line 883, in callback
result_list.append(f.result())
File "/home/user/venv/lib/python3.6/site-packages/tornado/gen.py", line 1147, in run
yielded = self.gen.send(value)
File "/home/user/venv/lib/python3.6/site-packages/distributed/deploy/local.py", line 217, in _start_worker
raise gen.TimeoutError("Worker failed to start")
tornado.util.TimeoutError: Worker failed to start
distributed.process - WARNING - reaping stray process <ForkServerProcess(ForkServerProcess-10, started daemon)>
distributed.nanny - WARNING - Worker process 849 was killed by unknown signal
/usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 12 leaked semaphores to clean up at shutdown
len(cache))
Exception ignored in: <bound method LocalCluster.__del__ of LocalCluster('tcp://127.0.0.1:38903', workers=0, ncores=0)>
Traceback (most recent call last):
File "/home/user/venv/lib/python3.6/site-packages/distributed/deploy/local.py", line 340, in __del__
File "/home/user/venv/lib/python3.6/site-packages/distributed/deploy/local.py", line 291, in close
File "/home/user/venv/lib/python3.6/site-packages/distributed/utils.py", line 427, in run_sync
File "/home/user/venv/lib/python3.6/site-packages/distributed/utils.py", line 272, in sync
tornado.util.TimeoutError: timed out after 20 s.
Strangely, if I just open a Python shell and type in the same two lines, everything works as expected.
Update:
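Putting everything inside a function and calling it under a __main__ guard works fine:
from distributed import LocalCluster

def main():
    cluster = LocalCluster()

# Only start the cluster when the file is run as a script.
if __name__ == "__main__":
    main()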
I'm still puzzled about why the first version fails.
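One possible explanation, offered as a sketch rather than a confirmed diagnosis: the log mentions ForkServerProcess, which suggests the workers are started through multiprocessing with the forkserver start method. With the spawn and forkserver start methods, each child process re-imports the main script under the name `__mp_main__`, so any unguarded module-level code (such as `cluster = LocalCluster()`) would run again inside every worker. The example below demonstrates only that re-import behavior using the standard library. It uses spawn instead of forkserver because spawn is available on every platform, and the names `SCRIPT`, `work`, and `run_demo` are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# A tiny stand-in for test.py: one unguarded module-level statement,
# plus a guarded block that starts a child process.
SCRIPT = textwrap.dedent('''
    import multiprocessing as mp

    # Unguarded: runs on every import of this file, including in children.
    print("module-level code running in", __name__, flush=True)

    def work():
        pass

    if __name__ == "__main__":
        # Guarded: runs only in the process started from the command line.
        mp.set_start_method("spawn")
        p = mp.Process(target=work)
        p.start()
        p.join()
''')


def run_demo():
    """Write SCRIPT to a temporary file, run it, and return its output lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        result = subprocess.run(
            [sys.executable, path],
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The module-level print fires twice: once in the parent, where `__name__` is `__main__`, and once in the child, where it is `__mp_main__`. If that statement were `LocalCluster()` instead of a print, each worker would try to start a cluster of its own while starting up, which would be consistent with the "Worker failed to start" timeout. In an interactive shell there is no script file for the children to re-import, which may be why typing the two lines by hand works.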