Keras:为什么我的val_acc在Epoch 42/50突然下降?

时间:2017-04-13 09:16:12

标签: tensorflow keras

使用大约27.000个图像样本用于CNN,具有非常好的性能,但突然之间,在第42纪元,验证准确度急剧下降(从val_acc:0.9982到val_acc:0.0678)!任何的想法?我应该停止以最大val_acc训练吗?验证精度始终高于训练精度也很奇怪。

    Using TensorFlow backend.
...
27091/27067 [==============================] - 2645s - loss: 0.0120 - acc: 0.9967 - val_loss: 0.0063 - val_acc: 0.9982
Epoch 33/50
27091/27067 [==============================] - 2674s - loss: 0.0114 - acc: 0.9971 - val_loss: 0.0145 - val_acc: 0.9975
Epoch 34/50
27091/27067 [==============================] - 2654s - loss: 0.0200 - acc: 0.9962 - val_loss: 0.0063 - val_acc: 0.9979
Epoch 35/50
27091/27067 [==============================] - 2649s - loss: 0.0137 - acc: 0.9964 - val_loss: 0.0069 - val_acc: 0.9985
Epoch 36/50
27091/27067 [==============================] - 2663s - loss: 0.0161 - acc: 0.9962 - val_loss: 0.0117 - val_acc: 0.9978
Epoch 37/50
27091/27067 [==============================] - 2680s - loss: 0.0155 - acc: 0.9959 - val_loss: 0.0039 - val_acc: 0.9993
Epoch 38/50
27091/27067 [==============================] - 2660s - loss: 0.0145 - acc: 0.9965 - val_loss: 0.0117 - val_acc: 0.9973
Epoch 39/50
27091/27067 [==============================] - 2647s - loss: 0.0111 - acc: 0.9970 - val_loss: 0.0127 - val_acc: 0.9982
Epoch 40/50
27091/27067 [==============================] - 2644s - loss: 0.0112 - acc: 0.9970 - val_loss: 0.0092 - val_acc: 0.9984
Epoch 41/50
27091/27067 [==============================] - 2658s - loss: 0.0131 - acc: 0.9967 - val_loss: 0.0057 - val_acc: 0.9982
Epoch 42/50
27091/27067 [==============================] - 2662s - loss: 0.0114 - acc: 0.7715 - val_loss: 1.1921e-07 - val_acc: 0.0678
Epoch 43/50
27091/27067 [==============================] - 2661s - loss: 1.1921e-07 - acc: 0.0714 - val_loss: 1.1921e-07 - val_acc: 0.0653
Epoch 44/50
27091/27067 [==============================] - 2668s - loss: 1.1921e-07 - acc: 0.0723 - val_loss: 1.1921e-07 - val_acc: 0.0664
Epoch 45/50
27091/27067 [==============================] - 2669s - loss: 1.1921e-07 - acc: 0.0731 - val_loss: 1.1921e-07 - val_acc: 0.0683

1 个答案:

答案 0 :(得分:4)

感谢Marcin Możejko指出了正确的方向。

这可能以非常高的学习率发生 loss can start increasing after some epochs如上所述here 它有效降低了学习率,如keras回调文档中所述。

示例:

 reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                  patience=5, min_lr=0.001)
    model.fit(X_train, Y_train, callbacks=[reduce_lr])