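I built and saved a TensorFlow model, and I am now trying to restore it and use it. I am using the old approach because the code was written for an older version of TensorFlow (I am currently on Python 3.5 and TensorFlow 1.8.0).
This is the code I used to save the model: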
sess = tf.InteractiveSession()
..>
#build the computational graph and all the layers. for example, the 1st layer:
W_conv1 = weight_variable([first_conv_kernel_size, first_conv_kernel_size, 1, first_conv_output_channels]) # 5x5 patch, 1 input channel, 32 output channels (features)
b_conv1 = bias_variable([first_conv_output_channels])
x_image = tf.reshape(x, [-1,patch_size,patch_size,1]) # reshape x to a 4d tensor. 2,3 are the image dimensions, 4 is the color channel
..<
sess.run(tf.initialize_all_variables())
..>
#some more code
..<
# saving the model:
saver = tf.train.Saver()
save_path = saver.save(sess, main_code_folder + 'code_files/Tensor_Flow/version1/built_networks/10 - testing_the_train_function/model.ckpt')
This is how I restore the model:
# initial parameters + build layers for tensorboard visualisation. for example, layer 1:
with tf.name_scope('conv_layer1'):
    # build the first layer
    with tf.name_scope('weights'):
        W_conv1 = weight_variable([first_conv_kernel_size, first_conv_kernel_size, 1, first_conv_output_channels]) # 5x5 patch, 1 input channel, 32 output channels (features)
        variable_summaries(W_conv1)
    with tf.name_scope('biases'):
        b_conv1 = bias_variable([first_conv_output_channels])
        variable_summaries(b_conv1)
    x_image = tf.reshape(x, [-1, patch_size, patch_size, 1]) # reshape x to a 4d tensor. 2,3 are the image dimensions, 4 is the color channel
    with tf.name_scope('Wx_plus_b'):
        Wx_plus_b = conv2d(x_image, W_conv1) + b_conv1
        variable_summaries(Wx_plus_b)
    # apply the layers
    h_conv1 = tf.nn.relu(Wx_plus_b)
...
saver = tf.train.Saver()
savepath = make_folder_name_Win_format(main_code_folder + 'code_files/Tensor_Flow/version1/built_networks/10 - testing_the_train_function/')
saver.restore(sess, save_path = savepath + '{}'.format(model_name))
When I run this code, I get the following error:
tensorflow.python.framework.errors_impl.NotFoundError: Key conv_layer1/biases/Variable not found in checkpoint
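I have seen several similar questions that were solved and tried their solutions. None of them worked. The directory name is the same in both pieces of code (as far as I can tell; can anyone suggest how to confirm this?), and the model was saved correctly (same caveat).
Any help would be greatly appreciated. Thanks!
Full error log below: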
2018-06-30 00:53:02.524332: W T:\src\github\tensorflow\tensorflow\core\framework\op_kernel.cc:1318] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key conv_layer1/biases/Variable not found in checkpoint
Traceback (most recent call last):
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1322, in _do_call
return fn(*args)
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key conv_layer1/biases/Variable not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2/_7 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_12_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Roi/Desktop/Code_Win_Ver/code_files/Tensor_Flow/version1/find_labels_for_db.py", line 252, in <module>
saver.restore(sess, save_path = savepath + '{}'.format(model_name))
File "C:\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 1802, in restore
{self.saver_def.filename_tensor_name: save_path})
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 900, in run
run_metadata_ptr)
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1316, in _do_run
run_metadata)
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key conv_layer1/biases/Variable not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2/_7 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_12_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
Caused by op 'save/RestoreV2', defined at:
File "C:/Users/Roi/Desktop/Code_Win_Ver/code_files/Tensor_Flow/version1/find_labels_for_db.py", line 247, in <module>
saver = tf.train.Saver()
File "C:\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 1338, in __init__
self.build()
File "C:\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 1347, in build
self._build(self._filename, build_save=True, build_restore=True)
File "C:\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 1384, in _build
build_save=build_save, build_restore=build_restore)
File "C:\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 835, in _build_internal
restore_sequentially, reshape)
File "C:\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 472, in _AddRestoreOps
restore_sequentially)
File "C:\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 886, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "C:\Python35\lib\site-packages\tensorflow\python\ops\gen_io_ops.py", line 1546, in restore_v2
shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
File "C:\Python35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 3392, in create_op
op_def=op_def)
File "C:\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
NotFoundError (see above for traceback): Key conv_layer1/biases/Variable not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2/_7 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_12_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
Process finished with exit code 1
Answer 0 (score: 0)
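The error occurs because the variable does not exist in the checkpoint. This happens when the saver is created before the corresponding variables, as in: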
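saver = tf.train.Saver()  # created first, so it only tracks variables that exist at this point
conv_layer1 = ...  # variables created here are unknown to the saver
saver.restore(sess, save_path=...)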
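Now, whenever you call save (after training or anywhere else), only the variables that already existed when the saver was created are written to the checkpoint. Variables added afterwards, such as conv_layer1/biases/Variable, are not.
To fix this, rearrange your code so that the saver is created after the variables that cause the problem, for example: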
conv_layer1 = ...  # build every variable first
saver = tf.train.Saver()  # the saver now tracks all of them
saver.restore(sess, save_path=...)
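To confirm which variables are actually missing, it can help to compare the names the restoring graph expects with the keys stored in the checkpoint. The sketch below is only an illustration: `missing_checkpoint_keys` is a hypothetical helper, and the two name lists are made-up examples rather than output from the asker's model. Note that variables built inside `tf.name_scope('conv_layer1')` get that prefix in their names, so if the saving script created them outside such a scope, the stored keys would lack the prefix and the lookup would fail in the same way.

```python
def missing_checkpoint_keys(graph_names, checkpoint_names):
    """Return the graph variable names that have no matching key in the checkpoint."""
    stored = set(checkpoint_names)
    return sorted(name for name in graph_names if name not in stored)


# Hypothetical name lists, for illustration only. With TensorFlow 1.x they could
# be collected with something like:
#   graph_names = [v.op.name for v in tf.global_variables()]
#   reader = tf.train.NewCheckpointReader(checkpoint_path)
#   checkpoint_names = list(reader.get_variable_to_shape_map().keys())
graph_names = ["conv_layer1/weights/Variable", "conv_layer1/biases/Variable"]
checkpoint_names = ["Variable", "Variable_1"]  # assumed: saved without name scopes

missing = missing_checkpoint_keys(graph_names, checkpoint_names)
for name in missing:
    print("not in checkpoint:", name)
```

If the list comes back empty, every variable the graph expects is present in the checkpoint and the problem is more likely the path passed to restore.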