Question

我正在使用Keras内的VGG16架构，我已通过以下方式重新培训以满足我的需求：

vgg16_model = keras.applications.vgg16.VGG16()
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)


model.layers.pop()
for layer in model.layers:
    layer.trainable = False

model.add(Dense(3, activation='softmax'))

model.compile(Adam(lr=.0001), loss='categorical_crossentropy', metrics=['accuracy'])

接下来我训练模型，然后按照keras文档中的建议方式保存整个模型：

from keras.models import load_model

model.save('my_model_vgg16.h5')  # creates a HDF5 file

加载模型时如下：

model = load_model('my_model_vgg16.h5')

在JupyterNotebook中使用经过训练的模型就像一个魅力。但是，当我在重新启动内核后尝试加载保存的模型时，我收到以下错误：

ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].

我无法弄清楚为什么会出现这种错误，因为在保存和加载过程中我既没有更改模型/图层的输入也没有输出大小。

出于测试目的，我尝试使用一个更简单的顺序模型，我在同一个pipleline中从头开始构建（即相同的保存和加载过程），这没有给我任何错误。因此，我想知道在使用预训练模型时是否存在我缺少的东西（保存并加载它）。

作为参考，整个控制台错误日志如下所示：

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
    685           graph_def_version, node_def_str, input_shapes, input_tensors,
--> 686           input_tensors_as_shapes, status)
    687   except errors.InvalidArgumentError as err:

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    472             compat.as_text(c_api.TF_Message(self.status.status)),
--> 473             c_api.TF_GetCode(self.status.status))
    474     # Delete the underlying status object from memory otherwise it stays alive

InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-5-a2d2e98db4b6> in <module>()
      1 from keras.models import load_model
----> 2 loaded_model = load_model('my_model_vgg16.h5')
      3 print("Loaded Model from disk")
      4 
      5 #compile and evaluate loaded model

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\models.py in load_model(filepath, custom_objects, compile)
    244 
    245         # set weights
--> 246         topology.load_weights_from_hdf5_group(f['model_weights'], model.layers)
    247 
    248         # Early return if compilation is not required.

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\engine\topology.py in load_weights_from_hdf5_group(f, layers)
   3164                              ' elements.')
   3165         weight_value_tuples += zip(symbolic_weights, weight_values)
-> 3166     K.batch_set_value(weight_value_tuples)
   3167 
   3168 

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\backend\tensorflow_backend.py in batch_set_value(tuples)
   2363                 assign_placeholder = tf.placeholder(tf_dtype,
   2364                                                     shape=value.shape)
-> 2365                 assign_op = x.assign(assign_placeholder)
   2366                 x._assign_placeholder = assign_placeholder
   2367                 x._assign_op = assign_op

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\variables.py in assign(self, value, use_locking)
    571       the assignment has completed.
    572     """
--> 573     return state_ops.assign(self._variable, value, use_locking=use_locking)
    574 
    575   def assign_add(self, delta, use_locking=False):

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\state_ops.py in assign(ref, value, validate_shape, use_locking, name)
    274     return gen_state_ops.assign(
    275         ref, value, use_locking=use_locking, name=name,
--> 276         validate_shape=validate_shape)
    277   return ref.assign(value)

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\gen_state_ops.py in assign(ref, value, validate_shape, use_locking, name)
     54     _, _, _op = _op_def_lib._apply_op_helper(
     55         "Assign", ref=ref, value=value, validate_shape=validate_shape,
---> 56         use_locking=use_locking, name=name)
     57     _result = _op.outputs[:]
     58     _inputs_flat = _op.inputs

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
    785         op = g.create_op(op_type_name, inputs, output_types, name=scope,
    786                          input_types=input_types, attrs=attr_protos,
--> 787                          op_def=op_def)
    788       return output_structure, op_def.is_stateful, op
    789 

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
   2956         op_def=op_def)
   2957     if compute_shapes:
-> 2958       set_shapes_for_outputs(ret)
   2959     self._add_op(ret)
   2960     self._record_op_seen_by_control_dependencies(ret)

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in set_shapes_for_outputs(op)
   2207       shape_func = _call_cpp_shape_fn_and_require_op
   2208 
-> 2209   shapes = shape_func(op)
   2210   if shapes is None:
   2211     raise RuntimeError(

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in call_with_requiring(op)
   2157 
   2158   def call_with_requiring(op):
-> 2159     return call_cpp_shape_fn(op, require_shape_fn=True)
   2160 
   2161   _call_cpp_shape_fn_and_require_op = call_with_requiring

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in call_cpp_shape_fn(op, require_shape_fn)
    625     res = _call_cpp_shape_fn_impl(op, input_tensors_needed,
    626                                   input_tensors_as_shapes_needed,
--> 627                                   require_shape_fn)
    628     if not isinstance(res, dict):
    629       # Handles the case where _call_cpp_shape_fn_impl calls unknown_shape(op).

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
    689       missing_shape_fn = True
    690     else:
--> 691       raise ValueError(err.message)
    692 
    693   if missing_shape_fn:

ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].

Answer 1

问题在于行model.layers.pop()。直接从列表model.layers弹出图层时，此模型的拓扑不会相应更新。因此，如果模型定义错误，以下所有操作都会出错。

具体来说，当您使用model.add(layer)添加图层时，列表model.outputs会更新为该图层的输出张量。您可以在Sequential.add()的源代码中找到以下行：

        output_tensor = layer(self.outputs[0])
        # ... skipping irrelevant lines
        self.outputs = [output_tensor]

但是，当您致电model.layers.pop()时，model.outputs不会相应更新。因此，将使用错误的输入张量调用下一个添加的图层（因为self.outputs[0]仍然是已删除图层的输出张量）。

这可以通过以下几行来证明：

model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)

model.layers.pop()
model.add(Dense(3, activation='softmax'))

print(model.layers[-1].input)
# => Tensor("predictions_1/Softmax:0", shape=(?, 1000), dtype=float32)
# the new layer is called on a wrong input tensor

print(model.layers[-1].kernel)
# => <tf.Variable 'dense_1/kernel:0' shape=(1000, 3) dtype=float32_ref>
# the kernel shape is also wrong

错误的内核形状是您看到有关不兼容的形状[4096,3]与[1000,3]的错误的原因。

要解决此问题，请不要将最后一层添加到Sequential模型中。

model = Sequential()
for layer in vgg16_model.layers[:-1]:
    model.add(layer)

使用Keras加载以前保存的重新训练的VGG16模型时的ValueError

1 个答案: