I am trying to train my model using transfer learning. For this I am using the VGG16 model, stripping off the top layers and freezing the first two layers, with ImageNet initial weights. For fine-tuning I use a learning rate of 0.0001, softmax activation, dropout of 0.5, categorical cross-entropy loss, the SGD optimizer, and 46 classes.
I just cannot understand the behavior during training. Train loss and accuracy both look fine (loss is decreasing, accuracy is increasing). Validation loss is decreasing and validation accuracy is increasing as well, but they are consistently higher than the train loss and accuracy.
Assuming it was overfitting, I made the model less complex, increased the dropout rate, and added more samples to the validation data, but nothing seemed to work. I am new to this, so any help would be appreciated.
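Roughly, the setup looks like this (a simplified sketch; the input size and head layer sizes here are placeholders, not the exact code):

from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

# VGG16 without the top layers, initialized with ImageNet weights
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the first two layers
for layer in base_model.layers[:2]:
    layer.trainable = False

# New classification head (layer sizes are placeholders)
x = Flatten()(base_model.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(46, activation='softmax')(x)

VGG16_model = Model(inputs=base_model.input, outputs=output)
VGG16_model.compile(optimizer=SGD(learning_rate=0.0001),
                    loss='categorical_crossentropy', metrics=['accuracy'])

The training log: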
Epoch 1/50
26137/26137 [==============================] - 7446s 285ms/step - loss: 1.1200 - accuracy: 0.3810 - val_loss: 3.1219 - val_accuracy: 0.4467
Epoch 2/50
26137/26137 [==============================] - 7435s 284ms/step - loss: 0.9944 - accuracy: 0.4353 - val_loss: 2.9348 - val_accuracy: 0.4694
Epoch 3/50
26137/26137 [==============================] - 7532s 288ms/step - loss: 0.9561 - accuracy: 0.4530 - val_loss: 1.6025 - val_accuracy: 0.4780
Epoch 4/50
26137/26137 [==============================] - 7436s 284ms/step - loss: 0.9343 - accuracy: 0.4631 - val_loss: 1.3032 - val_accuracy: 0.4860
Epoch 5/50
26137/26137 [==============================] - 7358s 282ms/step - loss: 0.9185 - accuracy: 0.4703 - val_loss: 1.4461 - val_accuracy: 0.4847
Epoch 6/50
26137/26137 [==============================] - 7396s 283ms/step - loss: 0.9083 - accuracy: 0.4748 - val_loss: 1.4093 - val_accuracy: 0.4908
Epoch 7/50
26137/26137 [==============================] - 7424s 284ms/step - loss: 0.8993 - accuracy: 0.4789 - val_loss: 1.4617 - val_accuracy: 0.4939
Epoch 8/50
26137/26137 [==============================] - 7433s 284ms/step - loss: 0.8925 - accuracy: 0.4822 - val_loss: 1.4257 - val_accuracy: 0.4978
Epoch 9/50
26137/26137 [==============================] - 7445s 285ms/step - loss: 0.8868 - accuracy: 0.4851 - val_loss: 1.5568 - val_accuracy: 0.4953
Epoch 10/50
26137/26137 [==============================] - 7387s 283ms/step - loss: 0.8816 - accuracy: 0.4874 - val_loss: 1.4534 - val_accuracy: 0.4970
Epoch 11/50
26137/26137 [==============================] - 7374s 282ms/step - loss: 0.8779 - accuracy: 0.4894 - val_loss: 1.4605 - val_accuracy: 0.4912
Epoch 12/50
26137/26137 [==============================] - 7411s 284ms/step - loss: 0.8733 - accuracy: 0.4915 - val_loss: 1.4694 - val_accuracy: 0.5030
Answer 0 (score: 0)
Yes, you are facing an overfitting problem. To mitigate it, you can try the following steps:
1. Shuffle the data by passing shuffle=True to VGG16_model.fit. The code is shown below:
history = VGG16_model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1,
                          validation_data=(x_validation, y_validation), shuffle=True)
2. Use EarlyStopping. The code is shown below:
callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=15)
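To take effect, the callback also needs to be passed to fit via the callbacks argument; a sketch reusing the names from the snippets above:

history = VGG16_model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1,
                          validation_data=(x_validation, y_validation), shuffle=True,
                          callbacks=[callback])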
3. Use regularization. The regularization code is shown below (you can also try l1 or l1_l2 regularization):
from tensorflow.keras.layers import Conv2D, Dense
from tensorflow.keras.regularizers import l2

Regularizer = l2(0.001)
VGG16_model.add(Conv2D(96, (11, 11), input_shape=(227, 227, 3), strides=(4, 4), padding='valid',
                       activation='relu', data_format='channels_last',
                       activity_regularizer=Regularizer, kernel_regularizer=Regularizer))
VGG16_model.add(Dense(units=2, activation='sigmoid',
                      activity_regularizer=Regularizer, kernel_regularizer=Regularizer))
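(The layers above are just an example of where the regularizers go; in your case the final Dense layer would use 46 units with softmax activation rather than 2 with sigmoid.)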
4. You can try using BatchNormalization.
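For example, a BatchNormalization layer can be added after a dense block in the classification head (a minimal sketch, assuming the same Sequential-style VGG16_model as in the snippets above):

from tensorflow.keras.layers import BatchNormalization, Dense

VGG16_model.add(Dense(256, activation='relu'))
VGG16_model.add(BatchNormalization())  # normalizes the activations of the previous layer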
5. Use ImageDataGenerator to perform image data augmentation. For more information, see this link.
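A minimal augmentation sketch with ImageDataGenerator (the augmentation parameters here are illustrative, not tuned for your data):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=20,       # random rotations up to 20 degrees
                             width_shift_range=0.1,   # random horizontal shifts
                             height_shift_range=0.1,  # random vertical shifts
                             horizontal_flip=True)    # random horizontal flips

history = VGG16_model.fit(datagen.flow(x_train, y_train, batch_size=batch_size),
                          epochs=epochs, validation_data=(x_validation, y_validation))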
6. If the pixels are not normalized, dividing the pixel values by 255 also helps.
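A quick sketch of this scaling step (assuming the inputs are standard 8-bit images):

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_validation = x_validation.astype('float32') / 255.0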