Training does not improve model performance on validation data

Time: 2019-06-10 00:59:39

Tags: validation tensorflow keras loss

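I am trying to train my ResNet-50 network on a dataset of 5968 training images and 1492 validation images (746 classes, with 8 training images and 2 validation images per class). I use the ImageDataGenerator flow_from_directory method to get the labels from the folder names.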
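
As a quick sanity check on these numbers, the following minimal sketch recomputes the split sizes and the number of full batches per epoch. It assumes a batch size of 32, the value used in the training script; the variable names are illustrative only.

```python
# Dataset layout as described: 746 classes, 8 training and 2 validation images each.
num_classes = 746
train_per_class = 8
val_per_class = 2
batch_size = 32  # assumption: same value as in the training script

train_images = num_classes * train_per_class
val_images = num_classes * val_per_class

# Integer division counts only full batches; the remainder is a partial batch.
steps_per_epoch = train_images // batch_size
validation_steps = val_images // batch_size
train_remainder = train_images % batch_size
val_remainder = val_images % batch_size

print("train/val images:", train_images, val_images)
print("full batches per epoch (train/val):", steps_per_epoch, validation_steps)
print("images left in a partial batch (train/val):", train_remainder, val_remainder)
```

The 186 training steps agree with the `186/186` progress counter in the training log.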

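My problem is that during training the training accuracy keeps increasing and the training loss keeps decreasing, which is good. However, the validation accuracy is very low (about 0.003) and does not improve at all. In addition, the validation loss is very high and keeps fluctuating at high values!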
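
To put these validation numbers in context, it may help to compare them with what a classifier that guesses uniformly over 746 classes would score. The sketch below is only a reference calculation, not part of the training code.

```python
import math

num_classes = 746

# A uniform random guess is correct once in num_classes tries.
chance_accuracy = 1.0 / num_classes

# Categorical cross-entropy of a uniform prediction is -ln(1/N) = ln(N).
uniform_loss = math.log(num_classes)

print(f"chance accuracy:    {chance_accuracy:.5f}")
print(f"uniform-guess loss: {uniform_loss:.3f}")
```

A validation accuracy of about 0.003 is roughly twice the chance level of about 0.0013, and a validation loss well above ln(746), which is about 6.6, would suggest that the network is confidently wrong on validation images rather than merely uncertain.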

Here is my code:

import numpy as np
from keras_preprocessing.image import ImageDataGenerator
from keras.utils.vis_utils import plot_model
import resnet
import json
from keras.callbacks import ModelCheckpoint, EarlyStopping
import keras
import pydot as pyd
keras.utils.vis_utils.pydot = pyd

data_path_l = ".\\TRAIN\\left_750\\"

test_data_path_l = ".\\TEST\\left_750\\"



num_classes=746
train_images=5968
val_images=1492
batch_size=32
epochs=500

img_channels=3
img_rows=224
img_cols=224

input_imgen = ImageDataGenerator(shear_range = 0.2, 
                               zoom_range = 0.2,
                               rotation_range=5.,
                               horizontal_flip = True)

valid_imgen = ImageDataGenerator()

train_it = input_imgen.flow_from_directory(directory=data_path_l,target_size=(img_rows,img_cols),
                                      color_mode="rgb",
                                      batch_size=batch_size,
                                      class_mode="categorical",
                                      shuffle=False,
                                      )

valid_it = valid_imgen.flow_from_directory(directory=test_data_path_l,target_size=(img_rows,img_cols),
                                      color_mode="rgb",
                                      batch_size=batch_size,
                                      class_mode="categorical",
                                      shuffle=False,
                                      )

model = resnet.ResnetBuilder.build_resnet_50((img_channels, img_rows, img_cols), num_classes)
model.compile(loss='categorical_crossentropy',
          optimizer='adam',
          metrics=['accuracy'])

filepath=".\\conv2D_models\\model-{epoch:02d}-{loss:.4f}.hdf5"

mc = ModelCheckpoint(filepath, save_weights_only=False, verbose=1, 
monitor='loss', mode='min')


history=model.fit_generator(train_it,
                    steps_per_epoch= train_images // batch_size,
                    validation_data = valid_it, 
                    validation_steps = val_images // batch_size,
                    epochs = epochs,callbacks=[mc],
                    shuffle=False)
model.save('resnet2D_1sample.h5')
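
One detail of the script that may be worth examining is that the training generator is created with shuffle=False. With that setting flow_from_directory is expected to yield files in class-folder order, so with 8 images per class each batch of 32 would contain only 4 classes. The sketch below simulates just the label order, without Keras, to compare ordered and shuffled batches; `distinct_classes_per_batch` is a hypothetical helper. Whether this accounts for the validation behavior is an assumption, not a confirmed diagnosis.

```python
import random

num_classes = 746
images_per_class = 8
batch_size = 32

# Labels in directory order: all images of class 0, then class 1, and so on.
ordered_labels = [c for c in range(num_classes) for _ in range(images_per_class)]

def distinct_classes_per_batch(labels, batch_size):
    """Return the number of distinct classes in each full batch."""
    n_full = len(labels) // batch_size
    return [
        len(set(labels[i * batch_size:(i + 1) * batch_size]))
        for i in range(n_full)
    ]

ordered_counts = distinct_classes_per_batch(ordered_labels, batch_size)

shuffled_labels = ordered_labels[:]
random.Random(0).shuffle(shuffled_labels)  # fixed seed so the run is repeatable
shuffled_counts = distinct_classes_per_batch(shuffled_labels, batch_size)

print("ordered  (min, max classes per batch):", min(ordered_counts), max(ordered_counts))
print("shuffled (min, max classes per batch):", min(shuffled_counts), max(shuffled_counts))
```

If batches this narrow turn out to be a problem, for example through skewed batch-normalization statistics, setting shuffle=True on the training generator only would be the usual thing to try; the validation generator can stay unshuffled.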

Here is part of the training log:

    Epoch 00059: saving model to .\conv2D_models\model-59-3.6342.hdf5
    Epoch 60/500
    186/186 [==============================] - 262s 1s/step - loss: 3.6074 - acc: 0.4078 - val_loss: 12.1131 - val_acc: 0.0034

    Epoch 00060: saving model to .\conv2D_models\model-60-3.6084.hdf5
    Epoch 61/500
    186/186 [==============================] - 276s 1s/step - loss: 3.5681 - acc: 0.4236 - val_loss: 12.0455 - val_acc: 0.0034

    Epoch 00061: saving model to .\conv2D_models\model-61-3.5683.hdf5
    Epoch 62/500
    186/186 [==============================] - 100s 536ms/step - loss: 3.4684 - acc: 0.4415 - val_loss: 10.2444 - val_acc: 0.0068

    Epoch 00062: saving model to .\conv2D_models\model-62-3.4674.hdf5
    Epoch 63/500
    186/186 [==============================] - 96s 516ms/step - loss: 3.4523 - acc: 0.4414 - val_loss: 11.6459 - val_acc: 0.0062

    Epoch 00063: saving model to .\conv2D_models\model-63-3.4530.hdf5
    Epoch 64/500
    186/186 [==============================] - 96s 516ms/step - loss: 3.3837 - acc: 0.4782 - val_loss: 12.3293 - val_acc: 0.0062

    Epoch 00064: saving model to .\conv2D_models\model-64-3.3847.hdf5
    Epoch 65/500
    186/186 [==============================] - 96s 515ms/step - loss: 3.2915 - acc: 0.5045 - val_loss: 12.8812 - val_acc: 0.0034

    Epoch 00065: saving model to .\conv2D_models\model-65-3.2928.hdf5
    Epoch 66/500
    186/186 [==============================] - 96s 517ms/step - loss: 3.2506 - acc: 0.5129 - val_loss: 13.2886 - val_acc: 0.0034

    Epoch 00066: saving model to .\conv2D_models\model-66-3.2527.hdf5
    Epoch 67/500
    186/186 [==============================] - 96s 515ms/step - loss: 3.2511 - acc: 0.5123 - val_loss: 14.4090 - val_acc: 0.0034

    Epoch 00067: saving model to .\conv2D_models\model-67-3.2530.hdf5
    Epoch 68/500
    186/186 [==============================] - 97s 519ms/step - loss: 3.2632 - acc: 0.5163 - val_loss: 16.2364 - val_acc: 0.0027

    Epoch 00068: saving model to .\conv2D_models\model-68-3.2650.hdf5
    Epoch 69/500
    186/186 [==============================] - 96s 517ms/step - loss: 3.1477 - acc: 0.5585 - val_loss: 16.2729 - val_acc: 0.0021

    Epoch 00069: saving model to .\conv2D_models\model-69-3.1487.hdf5
    Epoch 70/500
    186/186 [==============================] - 96s 516ms/step - loss: 2.9347 - acc: 0.6099 - val_loss: 16.7732 - val_acc: 0.0014

    Epoch 00070: saving model to .\conv2D_models\model-70-2.9369.hdf5
    Epoch 71/500
    186/186 [==============================] - 96s 515ms/step - loss: 2.7118 - acc: 0.6715 - val_loss: 15.4640 - val_acc: 0.0075

    Epoch 00071: saving model to .\conv2D_models\model-71-2.7134.hdf5
    Epoch 72/500
    186/186 [==============================] - 96s 517ms/step - loss: 2.6145 - acc: 0.6835 - val_loss: 16.2367 - val_acc: 0.0055

    Epoch 00072: saving model to .\conv2D_models\model-72-2.6159.hdf5
    Epoch 73/500
    186/186 [==============================] - 96s 517ms/step - loss: 2.5492 - acc: 0.6816 - val_loss: 16.8155 - val_acc: 0.0000e+00

    Epoch 00073: saving model to .\conv2D_models\model-73-2.5503.hdf5
    Epoch 74/500
    186/186 [==============================] - 96s 516ms/step - loss: 2.5743 - acc: 0.6786 - val_loss: 14.1867 - val_acc: 0.0021

    Epoch 00074: saving model to .\conv2D_models\model-74-2.5759.hdf5
    Epoch 75/500
    186/186 [==============================] - 96s 516ms/step - loss: 2.5295 - acc: 0.6962 - val_loss: 12.3790 - val_acc: 0.0055
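
To follow the gap between training and validation loss across epochs, the progress lines can be parsed into numbers. The sketch below does this for three lines copied from the log above; `parse_metrics` and the regular expression are illustrative helpers, not part of Keras.

```python
import re

# Three progress lines copied from the log (epochs 60, 70 and 75).
log_lines = [
    "186/186 [==============================] - 262s 1s/step - loss: 3.6074 - acc: 0.4078 - val_loss: 12.1131 - val_acc: 0.0034",
    "186/186 [==============================] - 96s 516ms/step - loss: 2.9347 - acc: 0.6099 - val_loss: 16.7732 - val_acc: 0.0014",
    "186/186 [==============================] - 96s 516ms/step - loss: 2.5295 - acc: 0.6962 - val_loss: 12.3790 - val_acc: 0.0055",
]

# Matches "name: value" pairs such as "val_loss: 12.1131" or "val_acc: 0.0000e+00".
metric_pattern = re.compile(r"(\w+): ([0-9.]+(?:e[+-]?\d+)?)")

def parse_metrics(line):
    """Return a dict mapping metric name to float for one progress line."""
    return {name: float(value) for name, value in metric_pattern.findall(line)}

for line in log_lines:
    metrics = parse_metrics(line)
    gap = metrics["val_loss"] - metrics["loss"]
    print(f"loss={metrics['loss']:.4f}  val_loss={metrics['val_loss']:.4f}  gap={gap:.4f}")
```

In these three samples the training loss falls from 3.6074 to 2.5295 while the validation loss stays above 12, so the gap does not close as training proceeds.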

Can anyone suggest potential causes of this strange training behavior? It has been blocking me for a week.
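
Since the labels come from folder names, one thing that could be ruled out cheaply is a mismatch between the class folders under the training and validation directories. The sketch below assumes that flow_from_directory assigns class indices by sorted subfolder name; `class_index_map` and `compare_class_folders` are hypothetical helpers, and the demo uses throwaway folders with made-up class names. With the real generators, comparing `train_it.class_indices` against `valid_it.class_indices` would be the more direct check.

```python
import os
import tempfile

def class_index_map(directory):
    """Map each class subfolder name to an index by sorted name order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}

def compare_class_folders(train_dir, val_dir):
    """Return (only_in_train, only_in_val, same_mapping)."""
    train_map = class_index_map(train_dir)
    val_map = class_index_map(val_dir)
    only_in_train = sorted(set(train_map) - set(val_map))
    only_in_val = sorted(set(val_map) - set(train_map))
    return only_in_train, only_in_val, train_map == val_map

# Tiny demonstration with temporary folders (hypothetical class names).
with tempfile.TemporaryDirectory() as root:
    train_dir = os.path.join(root, "TRAIN")
    val_dir = os.path.join(root, "TEST")
    for name in ["person_001", "person_002", "person_003"]:
        os.makedirs(os.path.join(train_dir, name))
    for name in ["person_001", "person_003"]:  # one class folder is missing here
        os.makedirs(os.path.join(val_dir, name))
    demo_result = compare_class_folders(train_dir, val_dir)
    print(demo_result)
```

In the demo the missing folder shifts the index of every later class in the validation mapping, which is the kind of silent label mismatch this check is meant to expose.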

0 answers:

No answers yet.