The first epoch takes much longer than the other epochs

Asked: 2019-06-13 07:43:44

Tags: python tensorflow keras google-colaboratory

I am training two CNNs on Google Colab: VGG16 and InceptionV3.
I have 10 classes, with 11614 training samples and 2884 validation samples.
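
The input pipeline isn't shown in the snippets below; a minimal sketch of the kind of generator setup I mean (the directory paths and batch_size=32 are assumptions, chosen because 11614 // 32 = 362 matches the steps per epoch in the logs further down):

    from keras.preprocessing.image import ImageDataGenerator

    # Hypothetical directory layout: one subfolder per class
    train_dir = 'data/train'       # 11614 images across 10 classes
    val_dir = 'data/validation'    # 2884 images across 10 classes

    datagen = ImageDataGenerator(rescale=1. / 255)
    # target_size=(224, 224) for the VGG16 runs; (299, 299) for InceptionV3
    train_gen = datagen.flow_from_directory(train_dir, target_size=(224, 224),
                                            batch_size=32, class_mode='categorical')
    val_gen = datagen.flow_from_directory(val_dir, target_size=(224, 224),
                                          batch_size=32, class_mode='categorical')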

This is the code used to instantiate the models:
a) VGG16 (~250,000 trainable parameters):

    from keras.applications import vgg16
    from keras.layers import Flatten, Dense
    from keras.models import Model

    # Load VGG16 pretrained on ImageNet, without its classification head
    base_model = vgg16.VGG16(include_top=False, weights='imagenet',
                             input_shape=(224, 224, 3))
    # Freeze the convolutional base so only the new head is trained
    for layer in base_model.layers:
        layer.trainable = False

    # New head: flatten the 7x7x512 feature map into a 10-way softmax
    x = base_model.output
    x = Flatten(name='flatten')(x)
    x = Dense(10, activation='softmax', name='predictions')(x)
    model = Model(inputs=base_model.input, outputs=x)

b) InceptionV3 (~20,000 trainable parameters):

    import keras
    from keras.layers import GlobalAveragePooling2D, Dense
    from keras.models import Model

    # Load InceptionV3 pretrained on ImageNet, without its classification head
    base_model = keras.applications.inception_v3.InceptionV3(include_top=False,
                      weights='imagenet', input_shape=(299, 299, 3))
    # Freeze the convolutional base so only the new head is trained
    for layer in base_model.layers:
        layer.trainable = False

    # New head: global average pooling (2048 features) into a 10-way softmax
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    x = Dense(10, activation='softmax', name='predictions')(x)
    model = Model(inputs=base_model.input, outputs=x)
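
Neither snippet shows the compile/fit call; both models are trained with something along these lines (the optimizer and the exact fit_generator arguments are assumptions, not copied from my notebook):

    # train_gen / val_gen as sketched above
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit_generator(train_gen,
                        steps_per_epoch=11614 // 32,   # = 362, as in the logs
                        validation_data=val_gen,
                        validation_steps=2884 // 32,
                        epochs=5)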

I trained each model for 5 epochs, just to see the results, and I noticed something very strange: the first epoch takes more than an hour to complete, while every subsequent epoch takes only about 3 minutes.
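
The per-epoch times below come straight from Keras's progress output; as an independent sanity check, the durations could also be measured with a small callback like this (purely a diagnostic sketch, not code from my runs):

    import time
    from keras.callbacks import Callback

    class EpochTimer(Callback):
        """Record the wall-clock duration of each epoch."""
        def on_epoch_begin(self, epoch, logs=None):
            self._start = time.time()
        def on_epoch_end(self, epoch, logs=None):
            print('epoch %d took %.1fs' % (epoch, time.time() - self._start))

    # passed via model.fit_generator(..., callbacks=[EpochTimer()])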

These are the training results:
a) VGG16:

    Epoch 1/5
    362/362 [==============================] - 6260s 17s/step - loss: 1.2611 - acc: 0.6735 - val_loss: 0.9555 - val_acc: 0.7712
    Epoch 2/5
    362/362 [==============================] - 159s 440ms/step - loss: 0.9351 - acc: 0.7800 - val_loss: 1.1295 - val_acc: 0.7903
    Epoch 3/5
    362/362 [==============================] - 156s 431ms/step - loss: 0.8751 - acc: 0.8033 - val_loss: 0.8300 - val_acc: 0.8219
    Epoch 4/5
    362/362 [==============================] - 155s 429ms/step - loss: 0.8482 - acc: 0.8075 - val_loss: 0.7436 - val_acc: 0.8524
    Epoch 5/5
    362/362 [==============================] - 155s 429ms/step - loss: 0.8031 - acc: 0.8263 - val_loss: 0.9327 - val_acc: 0.7970

b) InceptionV3:

    Epoch 1/5
    362/362 [==============================] - 4052s 11s/step - loss: 1.3590 - acc: 0.5542 - val_loss: 1.3469 - val_acc: 0.5455
    Epoch 2/5
    362/362 [==============================] - 249s 687ms/step - loss: 0.8921 - acc: 0.7115 - val_loss: 1.3225 - val_acc: 0.5519
    Epoch 3/5
    362/362 [==============================] - 241s 667ms/step - loss: 0.7938 - acc: 0.7347 - val_loss: 1.1960 - val_acc: 0.5999
    Epoch 4/5
    362/362 [==============================] - 239s 660ms/step - loss: 0.7589 - acc: 0.7416 - val_loss: 1.2979 - val_acc: 0.5593
    Epoch 5/5
    362/362 [==============================] - 237s 654ms/step - loss: 0.7252 - acc: 0.7505 - val_loss: 1.3122 - val_acc: 0.5565

Clearly, Inception starts overfitting almost immediately, even though I thought that shouldn't happen: it has fewer parameters to train, and I even added a dropout layer.
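
For reference, the trainable-parameter counts quoted above can be verified directly (a quick check, assuming the models are built as shown):

    from keras import backend as K

    # Sum the sizes of all trainable weights in the model
    trainable = sum(K.count_params(w) for w in model.trainable_weights)
    print(trainable)
    # VGG16 head:       7*7*512 * 10 + 10 = 250,890  (~250k)
    # InceptionV3 head: 2048    * 10 + 10 =  20,490  (~20k)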

Any ideas why the epoch durations differ so much?

Edit: apparently the dropout layer I mentioned is missing from the Inception instantiation posted above, but there was one when I actually trained.
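
The head as actually trained would therefore look roughly like this (the dropout rate of 0.5 is an assumption; only the presence of a Dropout layer is stated above):

    from keras.layers import GlobalAveragePooling2D, Dropout, Dense
    from keras.models import Model

    # base_model is the InceptionV3 instance from (b) above
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    x = Dropout(0.5)(x)   # rate assumed; the actual value isn't in the post
    x = Dense(10, activation='softmax', name='predictions')(x)
    model = Model(inputs=base_model.input, outputs=x)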

0 Answers