Distributed TensorFlow training traceback: Session was not ready

Time: 2018-09-13 03:09:50

Tags: tensorflow

I ran into this fatal error. I have searched around but found no clues. Please help.

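Following the TensorFlow documentation on distributed training, I specified a ClusterSpec and set up distributed training. However, after about 18 hours, the following traceback appeared. How can I fix it?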

Traceback (most recent call last):
  File "train.py", line 146, in <module>
    tf.app.run()
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run 
    _sys.exit(main(argv))
  File "train.py", line 138, in main
    save_summaries_steps=10) as sess:
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 383, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 832, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 555, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1018, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1023, in _create_session
    return self._sess_creator.create_session()
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 712, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 525, in create_session
    max_wait_secs=self._max_wait_secs
  File "/data/home/tf3/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 423, in wait_for_session
    "Session was not ready after waiting %d secs." % (max_wait_secs,))
tensorflow.python.framework.errors_impl.DeadlineExceededError: Session was not ready after waiting 7200 secs.
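
For context on what the last frames mean: the error is raised from `wait_for_session`, which gave up after `max_wait_secs` (7200 seconds in this traceback) because the session never became ready. The block below is only a simplified, pure-Python sketch of that wait-with-deadline pattern, not TensorFlow's actual implementation. The names `wait_until_ready`, `is_ready`, `recovery_wait_secs`, and the `DeadlineExceeded` class are stand-ins chosen for illustration.

```python
import time


class DeadlineExceeded(Exception):
    """Illustrative stand-in for TensorFlow's DeadlineExceededError."""


def wait_until_ready(is_ready, max_wait_secs, recovery_wait_secs=30.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or max_wait_secs have elapsed.

    is_ready is a hypothetical zero-argument callable; in the real setting
    it would correspond to checking whether the model has been initialized.
    clock and sleep are injectable so the loop can be exercised without
    actually waiting.
    """
    deadline = clock() + max_wait_secs
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            # Same message format as the last line of the traceback.
            raise DeadlineExceeded(
                "Session was not ready after waiting %d secs." % (max_wait_secs,))
        sleep(recovery_wait_secs)
```

Under this reading, the message says that nothing made the session ready within the two-hour window. Raising the timeout would only postpone the error if the underlying cause is that the process responsible for initializing the variables is down or unreachable.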

0 answers:

No answers yet.