I'm training two CNNs on Google Colab: VGG and InceptionV3.
I have 10 classes, with 11614 training samples and 2884 validation samples.
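(A note on the numbers: the logs below show 362 steps per epoch, which matches a batch size of 32, since 11614 // 32 = 362. A generator setup along the following lines is consistent with that; the paths and preprocessing are placeholder assumptions, not my exact code:)

from keras.preprocessing.image import ImageDataGenerator

# Placeholder paths -- the real data lives on the Colab filesystem.
train_dir = 'data/train'        # 10 class subfolders, 11614 images total
val_dir = 'data/validation'     # 2884 images total

train_datagen = ImageDataGenerator(rescale=1. / 255)
val_datagen = ImageDataGenerator(rescale=1. / 255)

# target_size must match the model input: (224, 224) for VGG, (299, 299) for Inception.
train_generator = train_datagen.flow_from_directory(
    train_dir, target_size=(224, 224), batch_size=32,
    class_mode='categorical')   # 362 steps per epoch (11614 // 32)
validation_generator = val_datagen.flow_from_directory(
    val_dir, target_size=(224, 224), batch_size=32,
    class_mode='categorical')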
This is the code used to instantiate the models:
a) VGG (~250,000 trainable parameters):
from keras.applications import vgg16
from keras.layers import Flatten, Dense
from keras.models import Model

base_model = vgg16.VGG16(include_top=False, weights='imagenet',
                         input_shape=(224, 224, 3))
for layer in base_model.layers:
    layer.trainable = False  # freeze the convolutional base
x = base_model.output
x = Flatten(name='flatten')(x)  # input_shape isn't needed when calling on a tensor
x = Dense(10, activation='softmax', name='predictions')(x)
model = Model(inputs=base_model.input, outputs=x)
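Both models are then compiled and trained the same way. A minimal sketch (the optimizer choice here is an assumption for illustration, not necessarily what produced the logs below):

# Training sketch -- optimizer and loss are assumptions.
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['acc'])

model.fit_generator(train_generator,
                    steps_per_epoch=362,              # 11614 images // batch size 32
                    epochs=5,
                    validation_data=validation_generator,
                    validation_steps=91)              # ceil(2884 / 32)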
b) InceptionV3 (~20,000 trainable parameters):
import keras
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

base_model = keras.applications.inception_v3.InceptionV3(include_top=False,
                                                         weights='imagenet',
                                                         input_shape=(299, 299, 3))
for layer in base_model.layers:
    layer.trainable = False  # freeze the convolutional base
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(10, activation='softmax', name='predictions')(x)
model = Model(inputs=base_model.input, outputs=x)
I trained each one for 5 epochs just to see the results, and I found something very strange: the first epoch takes over an hour to complete, while the remaining epochs take only about 3 minutes each.
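(To double-check the per-epoch wall-clock time independently of Keras's progress bar, a small callback like this sketch can be attached; it's purely for measurement and wasn't part of my training code:)

import time
from keras.callbacks import Callback

class EpochTimer(Callback):
    """Prints the wall-clock duration of every epoch."""
    def on_epoch_begin(self, epoch, logs=None):
        self._start = time.time()

    def on_epoch_end(self, epoch, logs=None):
        print('Epoch %d took %.1f s' % (epoch + 1, time.time() - self._start))

# Usage: model.fit_generator(..., callbacks=[EpochTimer()])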
These are the training results:
a) VGG:
Epoch 1/5
362/362 [==============================] - 6260s 17s/step - loss: 1.2611 - acc: 0.6735 - val_loss: 0.9555 - val_acc: 0.7712
Epoch 2/5
362/362 [==============================] - 159s 440ms/step - loss: 0.9351 - acc: 0.7800 - val_loss: 1.1295 - val_acc: 0.7903
Epoch 3/5
362/362 [==============================] - 156s 431ms/step - loss: 0.8751 - acc: 0.8033 - val_loss: 0.8300 - val_acc: 0.8219
Epoch 4/5
362/362 [==============================] - 155s 429ms/step - loss: 0.8482 - acc: 0.8075 - val_loss: 0.7436 - val_acc: 0.8524
Epoch 5/5
362/362 [==============================] - 155s 429ms/step - loss: 0.8031 - acc: 0.8263 - val_loss: 0.9327 - val_acc: 0.7970
b) InceptionV3:
Epoch 1/5
362/362 [==============================] - 4052s 11s/step - loss: 1.3590 - acc: 0.5542 - val_loss: 1.3469 - val_acc: 0.5455
Epoch 2/5
362/362 [==============================] - 249s 687ms/step - loss: 0.8921 - acc: 0.7115 - val_loss: 1.3225 - val_acc: 0.5519
Epoch 3/5
362/362 [==============================] - 241s 667ms/step - loss: 0.7938 - acc: 0.7347 - val_loss: 1.1960 - val_acc: 0.5999
Epoch 4/5
362/362 [==============================] - 239s 660ms/step - loss: 0.7589 - acc: 0.7416 - val_loss: 1.2979 - val_acc: 0.5593
Epoch 5/5
362/362 [==============================] - 237s 654ms/step - loss: 0.7252 - acc: 0.7505 - val_loss: 1.3122 - val_acc: 0.5565
Clearly, Inception suddenly starts overfitting, even though I thought that shouldn't happen: it has fewer parameters to train, and I even added a dropout layer.
Any ideas why the epochs are so different?
Edit: apparently the dropout layer isn't in the Inception instantiation I posted (the version with just the dense layer), but there was one when I actually ran the training.
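A sketch of that instantiation with the dropout layer included (the rate of 0.5 is an assumed value, since the actual code isn't shown above):

from keras.layers import GlobalAveragePooling2D, Dense, Dropout
from keras.models import Model

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dropout(0.5)(x)  # assumed rate; the actual value isn't in the post
x = Dense(10, activation='softmax', name='predictions')(x)
model = Model(inputs=base_model.input, outputs=x)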