I am training a transfer learning model for image classification with the tensorflow.keras library. I have 7 classes. I use pre-trained models trained on "imagenet"; for example, I am using Xception and unfreezing its last 30 layers. When I simply replace the final layer with a softmax layer, training looks reasonable. The model summary looks like this:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
xception (Model) (None, 5, 5, 2048) 20861480
_________________________________________________________________
global_average_pooling2d (Gl (None, 2048) 0
_________________________________________________________________
dense (Dense) (None, 7) 14343
=================================================================
Total params: 20,875,823
Trainable params: 8,990,679
Non-trainable params: 11,885,144
_________________________________________________________________
Number of layers in the base model: 132
Number of trainable layers in the full model: 3
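Roughly, this is how the model is assembled (not my exact code, just a minimal sketch: the 160x160 input size is inferred from the 5x5x2048 feature map above, and the optimizer/learning rate are placeholders):

import tensorflow as tf
from tensorflow.keras import layers, models

# Base model pre-trained on ImageNet; the 160x160 input size is an assumption
# inferred from the (None, 5, 5, 2048) feature-map shape in the summary above.
base_model = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(160, 160, 3))

# Unfreeze only the last 30 layers of the base model.
base_model.trainable = True
for layer in base_model.layers[:-30]:
    layer.trainable = False

# Head: global average pooling straight into a 7-way softmax.
model = models.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(7, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),  # placeholder optimizer/LR
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()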
The first two epochs of training look like this (training accuracy starts low, at roughly 10%):
Epoch 1/12
525/525 [==============================] - 190s 362ms/step - loss: 0.7438 - accuracy: 0.7314 - val_loss: 0.3813 - val_accuracy: 0.8648
Epoch 2/12
257/525 [=============>................] - ETA: 1:28 - loss: 0.4986 - accuracy: 0.8182
The problem starts when I add one or more new layers before the final softmax layer. It does not matter how many layers I add or which base model I use: training accuracy starts at 85.71%, and after the first epoch I already get very high validation accuracy (around 94-95%). Within a few epochs, validation accuracy exceeds 98%. However, when I test the model on a separate validation set, it actually performs worse than the model described above.
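That separate check is just model.evaluate on data the model never saw during training (sketch; holdout_ds is a hypothetical tf.data dataset of images and labels):

test_loss, test_acc = model.evaluate(holdout_ds)  # holdout_ds: hypothetical hold-out dataset
print(f"hold-out accuracy: {test_acc:.4f}")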
For example, the model summary of this variant is:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
xception (Model) (None, 5, 5, 2048) 20861480
_________________________________________________________________
global_average_pooling2d (Gl (None, 2048) 0
_________________________________________________________________
dense (Dense) (None, 1024) 2098176
_________________________________________________________________
dropout (Dropout) (None, 1024) 0
_________________________________________________________________
dense_1 (Dense) (None, 7) 7175
=================================================================
Total params: 22,966,831
Trainable params: 11,081,687
Non-trainable params: 11,885,144
_________________________________________________________________
Number of layers in the base model: 132
Number of trainable layers in the full model: 5
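A corresponding sketch for this variant (again not my exact code; the activation and dropout rate are placeholders), reusing the same base_model as above:

# Same base model, but with an extra fully connected block
# (Dense(1024) + Dropout) inserted before the softmax layer.
model = models.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1024, activation="relu"),   # activation is a placeholder choice
    layers.Dropout(0.5),                     # dropout rate is a placeholder
    layers.Dense(7, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),  # placeholder optimizer/LR
              loss="categorical_crossentropy",
              metrics=["accuracy"])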
The training epochs look like this:
Epoch 1/12
525/525 [==============================] - 191s 363ms/step - loss: 0.1902 - accuracy: 0.9225 - val_loss: 0.0971 - val_accuracy: 0.9640
Epoch 2/12
58/525 [==>...........................] - ETA: 2:38 - loss: 0.1351 - accuracy: 0.9470
I have swapped in different base models and added/removed layers before the softmax, but the training accuracy always starts at exactly 85.71%.