tf.train.MonitoredTrainingSession arguments

Date: 2017-01-05 05:43:27

Tags: python session tensorflow distributed-computing fault-tolerance

What does the config=None argument of tf.train.MonitoredTrainingSession take? And what is the correct syntax for specifying the master node (e.g. localhost:2222)?

This is the error I get when I use config = 'grpc://localhost:2222':

Traceback (most recent call last):
  File "add_1.py", line 36, in <module>
    scaffold=None, hooks=[saver_hook, summary_hook], chief_only_hooks=None, save_checkpoint_secs=10, save_summaries_steps=None, config='grpc://localhost:2222') as sess:
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 289, in MonitoredTrainingSession
    return MonitoredSession(session_creator=session_creator, hooks=hooks)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 447, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 618, in __init__
    _WrappedSession.__init__(self, self._sess_creator.create_session())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 505, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 341, in create_session
    init_fn=self._scaffold.init_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/session_manager.py", line 227, in prepare_session
    config=config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/session_manager.py", line 153, in _restore_checkpoint
    sess = session.Session(self._target, graph=self._graph, config=config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1186, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 540, in __init__
    % type(config))
TypeError: config must be a tf.ConfigProto, but got <type 'str'>
Exception AttributeError: "'Session' object has no attribute '_session'" in <bound method Session.__del__ of <tensorflow.python.client.session.Session object at 0x7fc540937ed0>> ignored

1 answer:

Answer 0 (score: 2)

The config argument of tf.train.MonitoredTrainingSession expects a tf.ConfigProto protocol buffer message, which is why passing a string raises the TypeError above.

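It looks like you should pass your value ("grpc://localhost:2222") as the master argument instead. It accepts the same values as the target argument of the tf.Session initializer: for example, "" means "the in-process runtime", and "grpc://localhost:2222" means "a gRPC-based tf.train.Server listening on localhost:2222".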
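
A minimal sketch of the difference between the two arguments, written for the TensorFlow 1.x graph-mode API. Several things here are assumptions rather than part of the question: the tensorflow.compat.v1 import (used so the sketch also runs on a TensorFlow 2.x install), the helper name run_one_step, the allow_soft_placement option, and the in-process server from tf.train.Server.create_local_server(), which stands in for the server the asker has on localhost:2222.

```python
# Sketch for the TensorFlow 1.x graph-mode API. The compat import is an
# assumption so that the example also runs on TensorFlow 2.x installs.
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()


def run_one_step(master, config):
    """Open a MonitoredTrainingSession on `master` and bump global_step once.

    Hypothetical helper, introduced only for this illustration.
    """
    with tf.Graph().as_default():
        global_step = tf.train.get_or_create_global_step()
        increment = tf.assign_add(global_step, 1)
        with tf.train.MonitoredTrainingSession(
                master=master,   # where to connect: "" or "grpc://host:port"
                is_chief=True,   # this process initializes the variables
                config=config    # session options: must be a tf.ConfigProto
        ) as sess:
            return sess.run(increment)


# Stand-in for the asker's server: an in-process gRPC server on a free port,
# so the sketch does not depend on anything listening on localhost:2222.
server = tf.train.Server.create_local_server()

# Session options go in a ConfigProto, never in a string.
session_config = tf.ConfigProto(allow_soft_placement=True)

step = run_one_step(server.target, session_config)
print(step)
```

With a server already listening on localhost:2222, the same call would take master="grpc://localhost:2222" in place of server.target, and config could be left as None if no session options are needed.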