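When working with TensorFlow in a Jupyter notebook, I seem to be unable to restore saved variables. I train an ANN, then I run saver.save(sess, "params1.ckpt"), then I train it some more and save the new result with saver.save(sess, "params2.ckpt"). But when I run saver.restore(sess, "params1.ckpt"), my model does not load the values saved in params1.ckpt and keeps the ones from params2.ckpt.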
If I run the model, save it to params.ckpt, then Close and Halt the notebook and try to load it again, I get the following error:
---------------------------------------------------------------------------
StatusNotOK Traceback (most recent call last)
StatusNotOK: Not found: Tensor name "Variable/Adam" not found in checkpoint files params.ckpt
[[Node: save/restore_slice_1 = RestoreSlice[dt=DT_FLOAT, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice_1/tensor_name, save/restore_slice_1/shape_and_slice)]]
During handling of the above exception, another exception occurred:
SystemError Traceback (most recent call last)
<ipython-input-6-39ae6b7641bd> in <module>()
----> 1 saver.restore(sess, "params.ckpt")
/usr/local/lib/python3.5/site-packages/tensorflow/python/training/saver.py in restore(self, sess, save_path)
889 save_path: Path where parameters were previously saved.
890 """
--> 891 sess.run([self._restore_op_name], {self._filename_tensor_name: save_path})
892
893
/usr/local/lib/python3.5/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict)
366
367 # Run request and get response.
--> 368 results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
369
370 # User may have fetched the same tensor multiple times, but we
/usr/local/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_run(self, target_list, fetch_list, feed_dict)
426
427 return tf_session.TF_Run(self._session, feed_dict, fetch_list,
--> 428 target_list)
429
430 except tf_session.StatusNotOK as e:
SystemError: <built-in function delete_Status> returned a result with an error set
Am I doing something wrong? Why can't I restore my variables?
Answer 0 (score: 7)
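It looks like you are using Jupyter to build your model. One possible problem is that a tf.train.Saver constructed with the default arguments uses the (auto-generated) names of the variables as the keys in the checkpoint. Since it is easy to re-execute code cells multiple times in Jupyter, you may end up with multiple copies of the variable nodes in the session that you save. See my answer to this question for an explanation of what can go wrong.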
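To make the naming issue concrete, here is a toy sketch in plain Python. It is not TensorFlow: ToyGraph, build_model and the checkpoint dictionary are hypothetical stand-ins that only mimic how a graph hands out unique names (Variable, Variable_1, ...) when the same cell is executed twice.

```python
# Toy model of automatic variable naming. ToyGraph is a hypothetical
# stand-in for illustration, not a TensorFlow class.
class ToyGraph:
    def __init__(self):
        self.names = []

    def variable(self, base="Variable"):
        # Mimic name uniquification: Variable, Variable_1, Variable_2, ...
        name, i = base, 0
        while name in self.names:
            i += 1
            name = "{}_{}".format(base, i)
        self.names.append(name)
        return name


def build_model(graph):
    """Stands in for a notebook cell that creates two variables."""
    return [graph.variable(), graph.variable()]


graph = ToyGraph()

# First execution of the cell, followed by a save keyed on the auto names.
first_run = build_model(graph)
checkpoint = {name: 0.0 for name in first_run}

# The same cell executed again in the same graph: the names drift.
second_run = build_model(graph)

# Names that a restore keyed on the new variables would not find.
missing = [name for name in second_run if name not in checkpoint]

print(first_run)   # ['Variable', 'Variable_1']
print(second_run)  # ['Variable_2', 'Variable_3']
print(missing)     # ['Variable_2', 'Variable_3']
```

In this toy, every name produced by the second run is absent from the checkpoint, which is the same kind of mismatch that a "not found in checkpoint" error reports.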
There are a few possible solutions. Here are the simplest:
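Call tf.reset_default_graph() before building your model (and the Saver). This ensures that the variables get the names you intended, but it invalidates any previously created graphs.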
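Use explicit arguments to tf.train.Saver() to specify persistent names for the variables. For your example this shouldn't be too hard (although it becomes unwieldy for larger models):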
saver = tf.train.Saver(var_list={"b1": b1, "W1": W1, "b2": b2, "W2": W2})
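Create a new tf.Graph() and make it the default each time you create your model. This can be tricky in Jupyter, since it forces you to put all of the model-building code in one cell, but it works well for scripts: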
with tf.Graph().as_default():
    # Model building and training/evaluation code goes here.
    pass
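Why the first two remedies keep checkpoint keys stable can be illustrated with a toy model in plain Python. This is a sketch under stated assumptions, not TensorFlow code: ToyGraph, its reset method and build_model are hypothetical stand-ins, and only the parameter names b1, W1, b2, W2 come from the example above.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph that uniquifies variable names."""

    def __init__(self):
        self.names = []

    def variable(self, base="Variable"):
        # Variable, Variable_1, Variable_2, ... as names are taken.
        name, i = base, 0
        while name in self.names:
            i += 1
            name = "{}_{}".format(base, i)
        self.names.append(name)
        return name

    def reset(self):
        # Plays the role of tf.reset_default_graph(): forget all names.
        self.names = []


def build_model(graph):
    # Maps each parameter to the auto-generated name it receives.
    return {key: graph.variable() for key in ("b1", "W1", "b2", "W2")}


# Remedy 1: reset before rebuilding, so auto-generated names repeat.
graph = ToyGraph()
saved_names = set(build_model(graph).values())
graph.reset()
rebuilt_names = set(build_model(graph).values())
reset_ok = saved_names == rebuilt_names

# Remedy 2: key the checkpoint on explicit names instead of auto names.
graph = ToyGraph()
first = build_model(graph)
explicit_keys = set(first)            # "b1", "W1", "b2", "W2"
second = build_model(graph)           # auto names drift on re-execution
explicit_ok = set(second) == explicit_keys
auto_names_drifted = set(first.values()) != set(second.values())

print(reset_ok, explicit_ok, auto_names_drifted)
```

The explicit keys stay the same even though the auto-generated names change between runs, which is the property the var_list dictionary relies on.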