I get an error when restoring a trained model for evaluation, but only when I evaluate on the test set. The error is:
InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
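Note that lhs shape = [2325,11] and rhs shape = [4891,11] correspond to the 2325 images in the test set and the 4891 images in the training set; 11 is the width of the one-hot encoding over 11 classes, so these tensors probably hold the labels. When I run the evaluation on the training set, the dimensions match and no error occurs. Any help would be much appreciated!
Full stack trace below: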
Traceback (most recent call last):
File "eval.py", line 75, in <module>
main()
File "eval.py", line 70, in main
acc_annotation, acc_retrieval = evaluate(partition="test")
File "eval.py", line 34, in evaluate
restorer.restore(sess, tf.train.latest_checkpoint(SAVED_MODEL_DIR))
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1388, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
[[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@input/Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](input/Variable_1, save/RestoreV2_5)]]
Caused by op u'save/Assign_5', defined at:
File "eval.py", line 75, in <module>
main()
File "eval.py", line 70, in main
acc_annotation, acc_retrieval = evaluate(partition="test")
File "eval.py", line 25, in evaluate
restorer = tf.train.Saver() # For saving the model
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1000, in __init__
self.build()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1030, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 624, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 373, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 130, in restore
self.op.get_shape().is_fully_defined())
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
[[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@input/Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](input/Variable_1, save/RestoreV2_5)]]
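Update
I just looked at the tensor shapes in the checkpoint file, and it looks like the saver saved even the model's inputs. I need to restructure my training code, or otherwise figure out how to exclude the model inputs (labels and images) from the checkpoint: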
('tensor_name: ', 'conv2-layer/bias/Adam_1')
(512,)
('tensor_name: ', 'input/Variable_1')
(4891, 11)
('tensor_name: ', 'conv2-layer/weights_1/Adam')
(5, 1, 64, 512)
('tensor_name: ', 'conv1-layer/weights_1')
(5, 23, 1, 64)
('tensor_name: ', 'conv2-layer/weights_1')
(5, 1, 64, 512)
('tensor_name: ', 'conv2-layer/weights_1/Adam_1')
(5, 1, 64, 512)
('tensor_name: ', 'input/Variable')
(4891, 100, 23, 1)
('tensor_name: ', 'conv1-layer/weights_1/Adam_1')
(5, 23, 1, 64)
('tensor_name: ', 'conv1-layer/bias/Adam')
(64,)
('tensor_name: ', 'beta2_power')
()
('tensor_name: ', 'conv2-layer/bias/Adam')
(512,)
('tensor_name: ', 'conv1-layer/bias/Adam_1')
(64,)
('tensor_name: ', 'conv2-layer/bias')
(512,)
('tensor_name: ', 'conv1-layer/bias')
(64,)
('tensor_name: ', 'beta1_power')
()
('tensor_name: ', 'conv1-layer/weights_1/Adam')
(5, 23, 1, 64)
('tensor_name: ', 'Variable')
()
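A minimal sketch of excluding the inputs at restore time, assuming the input variables all live under the `input/` name scope as the listing above suggests. The helper name `restorable_variables` is hypothetical; `tf.train.Saver(var_list=...)`, `tf.global_variables()` and `tf.variables_initializer()` are the TensorFlow 1.x calls it would be paired with, shown here only in comments.

```python
def restorable_variables(variables, excluded_scopes=("input/",)):
    """Return the variables whose names do not start with an excluded scope.

    `variables` is any iterable of objects with a `.name` attribute,
    e.g. the list returned by tf.global_variables().
    """
    prefixes = tuple(excluded_scopes)
    return [v for v in variables if not v.name.startswith(prefixes)]


# Hypothetical use inside evaluate() in eval.py (TensorFlow 1.x API):
#
#   model_vars = restorable_variables(tf.global_variables())
#   restorer = tf.train.Saver(var_list=model_vars)
#   restorer.restore(sess, tf.train.latest_checkpoint(SAVED_MODEL_DIR))
#
# The excluded input variables are not restored, so they still have to be
# initialized separately, for example:
#
#   skipped = [v for v in tf.global_variables() if v not in model_vars]
#   sess.run(tf.variables_initializer(skipped))
```

With this filter the saver never creates an Assign op for `input/Variable` or `input/Variable_1`, so the 2325 vs. 4891 mismatch cannot arise during restore. The same `var_list` could be passed to the saver in the training script to keep the inputs out of future checkpoints.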
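Answer 0 (score: 1)
It would be useful to see the code that defines the model, but from what I can tell you have defined your inputs as tf.Variable. Variables are values the optimizer is allowed to change in order to minimize the loss function. They are the model's learned weights, which is why TensorFlow saves them so they can be restored later. Because each variable is saved with its shape, an input variable sized for the 4891 training examples cannot be restored into one sized for the 2325 test examples.
You should use tf.placeholder to feed the input data into the graph.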
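A rough sketch of what that could look like, since the original model code is not shown. The per-example shapes `[100, 23, 1]` and `[11]` are taken from the checkpoint listing in the question; the names `images`, `labels`, `bias`, `logits` and `run_batch` are made up for illustration, and the one-variable "model" is only a stand-in. It assumes a TensorFlow build that exposes the 1.x API as `tensorflow.compat.v1`; on an actual 1.x install, a plain `import tensorflow as tf` should serve the same purpose.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API available here

tf.disable_v2_behavior()

graph = tf.Graph()
with graph.as_default():
    with tf.name_scope("input"):
        # None leaves the batch dimension open, so one graph accepts
        # both the 4891 training examples and the 2325 test examples.
        images = tf.placeholder(tf.float32, [None, 100, 23, 1], name="images")
        labels = tf.placeholder(tf.float32, [None, 11], name="labels")

    # Stand-in for the real conv layers: a single trainable variable.
    bias = tf.get_variable("bias", shape=[11],
                           initializer=tf.zeros_initializer())
    per_example = tf.reduce_mean(images, axis=[1, 2, 3])   # shape [batch]
    logits = tf.expand_dims(per_example, 1) + bias          # shape [batch, 11]

    # Only variables end up in a checkpoint; placeholders are not variables.
    checkpointed_names = [v.name for v in tf.global_variables()]
    init = tf.global_variables_initializer()


def run_batch(batch_size):
    """Feed a dummy batch of the given size and return the logits."""
    feed = {
        images: np.zeros((batch_size, 100, 23, 1), dtype=np.float32),
        labels: np.zeros((batch_size, 11), dtype=np.float32),
    }
    with tf.Session(graph=graph) as sess:
        sess.run(init)
        return sess.run(logits, feed_dict=feed)
```

Calling `run_batch` with two different batch sizes runs the same graph both times, and `checkpointed_names` contains only the model variable, so a saver built on this graph has nothing dataset-sized to restore.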