I am using the VGG16 architecture in Keras, which I retrained to fit my needs in the following way:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

vgg16_model = keras.applications.vgg16.VGG16()
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
model.layers.pop()
for layer in model.layers:
    layer.trainable = False
model.add(Dense(3, activation='softmax'))
model.compile(Adam(lr=.0001), loss='categorical_crossentropy', metrics=['accuracy'])
Next I train the model and then save the whole model in the way recommended by the Keras documentation:
from keras.models import load_model
model.save('my_model_vgg16.h5') # creates a HDF5 file
The model is then loaded as follows:
model = load_model('my_model_vgg16.h5')
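Using the trained model inside the Jupyter Notebook works like a charm. However, when I try to load the saved model after restarting the kernel, I get the following error: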
ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].
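I cannot figure out why this error occurs, since I changed neither the input nor the output size of the model or its layers during saving and loading.
For testing purposes, I tried a simpler Sequential model that I built from scratch in the same pipeline (i.e. the same saving and loading procedure), and it gave me no error. So I am wondering whether I am missing something when using a pretrained model (saving and then loading it).
For reference, the whole console error log looks as follows: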
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
685 graph_def_version, node_def_str, input_shapes, input_tensors,
--> 686 input_tensors_as_shapes, status)
687 except errors.InvalidArgumentError as err:
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
472 compat.as_text(c_api.TF_Message(self.status.status)),
--> 473 c_api.TF_GetCode(self.status.status))
474 # Delete the underlying status object from memory otherwise it stays alive
InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-5-a2d2e98db4b6> in <module>()
1 from keras.models import load_model
----> 2 loaded_model = load_model('my_model_vgg16.h5')
3 print("Loaded Model from disk")
4
5 #compile and evaluate loaded model
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\models.py in load_model(filepath, custom_objects, compile)
244
245 # set weights
--> 246 topology.load_weights_from_hdf5_group(f['model_weights'], model.layers)
247
248 # Early return if compilation is not required.
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\engine\topology.py in load_weights_from_hdf5_group(f, layers)
3164 ' elements.')
3165 weight_value_tuples += zip(symbolic_weights, weight_values)
-> 3166 K.batch_set_value(weight_value_tuples)
3167
3168
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\backend\tensorflow_backend.py in batch_set_value(tuples)
2363 assign_placeholder = tf.placeholder(tf_dtype,
2364 shape=value.shape)
-> 2365 assign_op = x.assign(assign_placeholder)
2366 x._assign_placeholder = assign_placeholder
2367 x._assign_op = assign_op
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\variables.py in assign(self, value, use_locking)
571 the assignment has completed.
572 """
--> 573 return state_ops.assign(self._variable, value, use_locking=use_locking)
574
575 def assign_add(self, delta, use_locking=False):
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\state_ops.py in assign(ref, value, validate_shape, use_locking, name)
274 return gen_state_ops.assign(
275 ref, value, use_locking=use_locking, name=name,
--> 276 validate_shape=validate_shape)
277 return ref.assign(value)
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\gen_state_ops.py in assign(ref, value, validate_shape, use_locking, name)
54 _, _, _op = _op_def_lib._apply_op_helper(
55 "Assign", ref=ref, value=value, validate_shape=validate_shape,
---> 56 use_locking=use_locking, name=name)
57 _result = _op.outputs[:]
58 _inputs_flat = _op.inputs
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
785 op = g.create_op(op_type_name, inputs, output_types, name=scope,
786 input_types=input_types, attrs=attr_protos,
--> 787 op_def=op_def)
788 return output_structure, op_def.is_stateful, op
789
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
2956 op_def=op_def)
2957 if compute_shapes:
-> 2958 set_shapes_for_outputs(ret)
2959 self._add_op(ret)
2960 self._record_op_seen_by_control_dependencies(ret)
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in set_shapes_for_outputs(op)
2207 shape_func = _call_cpp_shape_fn_and_require_op
2208
-> 2209 shapes = shape_func(op)
2210 if shapes is None:
2211 raise RuntimeError(
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in call_with_requiring(op)
2157
2158 def call_with_requiring(op):
-> 2159 return call_cpp_shape_fn(op, require_shape_fn=True)
2160
2161 _call_cpp_shape_fn_and_require_op = call_with_requiring
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in call_cpp_shape_fn(op, require_shape_fn)
625 res = _call_cpp_shape_fn_impl(op, input_tensors_needed,
626 input_tensors_as_shapes_needed,
--> 627 require_shape_fn)
628 if not isinstance(res, dict):
629 # Handles the case where _call_cpp_shape_fn_impl calls unknown_shape(op).
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
689 missing_shape_fn = True
690 else:
--> 691 raise ValueError(err.message)
692
693 if missing_shape_fn:
ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].
Answer 0 (score: 3)
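The problem is the line model.layers.pop(). When you pop a layer directly from the list model.layers, the topology of the model is not updated accordingly. The model definition is therefore wrong, and every operation that follows goes wrong with it.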
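Specifically, when you add a layer with model.add(layer), the list model.outputs is updated to hold the output tensor of that layer. You can find the following lines in the source code of Sequential.add():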
output_tensor = layer(self.outputs[0])
# ... skipping irrelevant lines
self.outputs = [output_tensor]
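However, when you call model.layers.pop(), model.outputs is not updated accordingly. As a result, the next layer you add is called on the wrong input tensor (because self.outputs[0] is still the output tensor of the removed layer).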
This can be demonstrated with the following lines:
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
model.layers.pop()
model.add(Dense(3, activation='softmax'))
print(model.layers[-1].input)
# => Tensor("predictions_1/Softmax:0", shape=(?, 1000), dtype=float32)
# the new layer is called on a wrong input tensor
print(model.layers[-1].kernel)
# => <tf.Variable 'dense_1/kernel:0' shape=(1000, 3) dtype=float32_ref>
# the kernel shape is also wrong
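The wrong kernel shape is the reason you see the error about the incompatible shapes [4096,3] and [1000,3]: the kernel was created and saved with shape [1000,3], whereas the model rebuilt on load follows the layer list, connects the new layer to the 4096-unit layer, and therefore expects [4096,3].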
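The mechanism can be mimicked in a few lines of plain Python, without Keras. This is only a toy sketch: ToyDense, ToySequential and rebuild_from_layer_list are made-up stand-ins, not Keras classes, and "tensors" are reduced to their width. It assumes that reloading rebuilds the model by chaining whatever is in the layer list, which is roughly what the traceback suggests.

```python
class ToyDense:
    """Hypothetical stand-in for a dense layer: only tracks its kernel shape."""

    def __init__(self, units):
        self.units = units
        self.kernel_shape = None  # set when the layer is first called

    def __call__(self, input_dim):
        # Building the layer fixes its kernel shape from the incoming width.
        self.kernel_shape = (input_dim, self.units)
        return self.units  # the "output tensor" is reduced to its width


class ToySequential:
    """Hypothetical stand-in for a Sequential model that caches its last output."""

    def __init__(self, input_dim):
        self.layers = []
        self.outputs = [input_dim]

    def add(self, layer):
        # Same pattern as the quoted lines: call the layer on the cached output.
        output = layer(self.outputs[0])
        self.layers.append(layer)
        self.outputs = [output]


def rebuild_from_layer_list(layers, input_dim):
    """Recompute kernel shapes by chaining the layers in list order."""
    shapes = []
    dim = input_dim
    for layer in layers:
        shapes.append((dim, layer.units))
        dim = layer.units
    return shapes


model = ToySequential(input_dim=25088)   # width of VGG16's flattened features
for units in (4096, 4096, 1000):         # fc1, fc2, predictions
    model.add(ToyDense(units))

model.layers.pop()        # removes the 1000-unit layer from the list only
new_head = ToyDense(3)
model.add(new_head)       # still called on the stale 1000-wide output

built_shape = new_head.kernel_shape
reloaded_shape = rebuild_from_layer_list(model.layers, 25088)[-1]
print(built_shape)        # (1000, 3)
print(reloaded_shape)     # (4096, 3)
```

The two printed shapes are the same pair that appears in the error message: the kernel that was actually built versus the one implied by the layer list.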
To fix the problem, simply do not add the last layer to the Sequential model in the first place.
model = Sequential()
for layer in vgg16_model.layers[:-1]:
    model.add(layer)
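Combined with the freezing step from the question, the whole construction might look like the sketch below. It assumes the standalone Keras 2.x import paths used in the question. The function name build_finetune_model and its parameters are made up for illustration. The imports sit inside the function only so that the file can be loaded on a machine without Keras installed. Compiling is left out on purpose; do that as in the question.

```python
def build_finetune_model(num_classes=3, weights="imagenet"):
    """Sketch: VGG16 without its 1000-way head, plus a new softmax head.

    Hypothetical helper, not part of Keras.
    """
    # Assumption: standalone Keras 2.x, as in the question.
    from keras.applications.vgg16 import VGG16
    from keras.layers import Dense
    from keras.models import Sequential

    base = VGG16(weights=weights)

    model = Sequential()
    # Skip the final 'predictions' layer instead of popping it afterwards,
    # so the layer list and the tensor graph never get out of sync.
    for layer in base.layers[:-1]:
        layer.trainable = False  # freeze the pretrained layers
        model.add(layer)

    # The new head is called on the 4096-wide output of the last kept layer.
    model.add(Dense(num_classes, activation="softmax"))
    return model
```

After training, model.save('my_model_vgg16.h5') and load_model('my_model_vgg16.h5') can be used as in the question. Since the new layer is now built on the 4096-unit layer, its kernel should have shape (4096, 3) both when it is saved and when the model is rebuilt; printing model.layers[-1].kernel is a quick way to check.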