Validation loss increases at every epoch

Date: 2020-05-12 05:54:21

Tags: tensorflow image-processing neural-network classification conv-neural-network

Recently I have been working on a multi-class classification task. My dataset contains 17 image categories. Previously I used 3 convolutional layers and 2 hidden (dense) layers. The model overfit badly, producing a huge validation loss of around 11.0+ and very low validation accuracy. So I reduced the model by one convolutional layer and one hidden layer, and I also removed dropout. Although my training accuracy and loss improved, the model still overfits on the validation set.

Here is the code I used to prepare my dataset:

import cv2
import numpy as np
import os
import pickle
import random

CATEGORIES = ["apple_pie", "baklava", "caesar_salad","donuts",
              "fried_calamari", "grilled_salmon", "hamburger",
              "ice_cream", "lasagna", "macaroni_and_cheese", "nachos", "omelette","pizza",
              "risotto", "steak", "tiramisu", "waffles"]
DATALOC = "D:/Foods/Datasets"
IMAGE_SIZE = 50

data_training = []

def create_data_training():
    for category in CATEGORIES:
        path = os.path.join(DATALOC, category)
        class_num = CATEGORIES.index(category)
        for image in os.listdir(path):
            try:
                image_array = cv2.imread(os.path.join(path,image), cv2.IMREAD_GRAYSCALE)
                new_image_array = cv2.resize(image_array, (IMAGE_SIZE,IMAGE_SIZE))
                data_training.append([new_image_array,class_num])
            except Exception:
                # skip files that OpenCV cannot read (corrupt or non-image files)
                pass

create_data_training()

random.shuffle(data_training)

X = []
y = []

for features, label in data_training:
    X.append(features)
    y.append(label)

X = np.array(X).reshape(-1, IMAGE_SIZE, IMAGE_SIZE, 1)
y = np.array(y)

pickle_out = open("X.pickle", "wb")
pickle.dump(X, pickle_out)
pickle_out.close()

pickle_out = open("y.pickle", "wb")
pickle.dump(y, pickle_out)
pickle_out.close()

pickle_in = open("X.pickle","rb")
X = pickle.load(pickle_in)

Here is the code for my model:

import pickle
import tensorflow as tf
import time
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.layers import Activation, Conv2D, Dense, Dropout, Flatten, MaxPooling2D

NAME = "Foods-Model-{}".format(int(time.time()))
tensorboard = TensorBoard(log_dir='logs/{}'.format(NAME))  # forward slash avoids backslash-escape issues in the path

X = pickle.load(open("X.pickle","rb"))
y = pickle.load(open("y.pickle","rb"))

X = X/255.0

model = Sequential()
model.add(Conv2D(32,(3,3), input_shape = X.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size =(2,2)))

model.add(Conv2D(64,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size =(2,2)))

model.add(Flatten())

model.add(Dense(128))
model.add(Activation("relu"))

model.add(Dense(17))
model.add(Activation('softmax'))

model.compile(loss = "sparse_categorical_crossentropy", optimizer = "adam", metrics = ['accuracy'])

model.fit(X, y, batch_size = 16, epochs = 20 , validation_split = 0.1, callbacks = [tensorboard])

The results of the trained model:

Train on 7650 samples, validate on 850 samples
Epoch 1/20
7650/7650 [==============================] - 242s 32ms/sample - loss: 2.7826 - accuracy: 0.1024 - val_loss: 2.7018 - val_accuracy: 0.1329
Epoch 2/20
7650/7650 [==============================] - 241s 31ms/sample - loss: 2.5673 - accuracy: 0.1876 - val_loss: 2.5597 - val_accuracy: 0.2059
Epoch 3/20
7650/7650 [==============================] - 234s 31ms/sample - loss: 2.3529 - accuracy: 0.2617 - val_loss: 2.5329 - val_accuracy: 0.2153
Epoch 4/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 2.0707 - accuracy: 0.3510 - val_loss: 2.6628 - val_accuracy: 0.2059
Epoch 5/20
7650/7650 [==============================] - 231s 30ms/sample - loss: 1.6960 - accuracy: 0.4753 - val_loss: 2.8143 - val_accuracy: 0.2047
Epoch 6/20
7650/7650 [==============================] - 230s 30ms/sample - loss: 1.2336 - accuracy: 0.6247 - val_loss: 3.3130 - val_accuracy: 0.1929
Epoch 7/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.7738 - accuracy: 0.7715 - val_loss: 3.9758 - val_accuracy: 0.1776
Epoch 8/20
7650/7650 [==============================] - 231s 30ms/sample - loss: 0.4271 - accuracy: 0.8827 - val_loss: 4.7325 - val_accuracy: 0.1882
Epoch 9/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.2080 - accuracy: 0.9519 - val_loss: 5.7198 - val_accuracy: 0.1918
Epoch 10/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.1402 - accuracy: 0.9668 - val_loss: 6.0608 - val_accuracy: 0.1835
Epoch 11/20
7650/7650 [==============================] - 236s 31ms/sample - loss: 0.0724 - accuracy: 0.9872 - val_loss: 6.7468 - val_accuracy: 0.1753
Epoch 12/20
7650/7650 [==============================] - 232s 30ms/sample - loss: 0.0549 - accuracy: 0.9895 - val_loss: 7.4844 - val_accuracy: 0.1718
Epoch 13/20
7650/7650 [==============================] - 229s 30ms/sample - loss: 0.1541 - accuracy: 0.9591 - val_loss: 7.3335 - val_accuracy: 0.1553
Epoch 14/20
7650/7650 [==============================] - 231s 30ms/sample - loss: 0.0477 - accuracy: 0.9905 - val_loss: 7.8453 - val_accuracy: 0.1729
Epoch 15/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.0346 - accuracy: 0.9908 - val_loss: 8.1847 - val_accuracy: 0.1753
Epoch 16/20
7650/7650 [==============================] - 231s 30ms/sample - loss: 0.0657 - accuracy: 0.9833 - val_loss: 7.8582 - val_accuracy: 0.1624
Epoch 17/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.0555 - accuracy: 0.9830 - val_loss: 8.2578 - val_accuracy: 0.1553
Epoch 18/20
7650/7650 [==============================] - 230s 30ms/sample - loss: 0.0423 - accuracy: 0.9892 - val_loss: 8.6970 - val_accuracy: 0.1694
Epoch 19/20
7650/7650 [==============================] - 236s 31ms/sample - loss: 0.0291 - accuracy: 0.9927 - val_loss: 8.5275 - val_accuracy: 0.1882
Epoch 20/20
7650/7650 [==============================] - 234s 31ms/sample - loss: 0.0443 - accuracy: 0.9873 - val_loss: 9.2703 - val_accuracy: 0.1812

Thank you for your time. Any help and suggestions would be greatly appreciated.

1 answer:

Answer 0 (score: 1)

Your model starts overfitting very early.

  1. Get rid of the dense hidden layer entirely and use global pooling.
from tensorflow.keras.layers import GlobalAveragePooling2D  # needed for the pooling layer below

model = Sequential()
model.add(Conv2D(32, (3,3), input_shape = X.shape[1:]))
model.add(Activation("relu"))

model.add(Conv2D(64, (3,3)))
model.add(Activation("relu"))

model.add(Conv2D(128, (3,3)))
model.add(Activation("relu"))

# Global average pooling replaces Flatten + a large Dense layer,
# drastically reducing the number of parameters that can overfit.
model.add(GlobalAveragePooling2D())

model.add(Dense(17))
model.add(Activation('softmax'))

model.summary()
  2. Use SpatialDropout2D after the convolutional layers.

Ref: https://www.tensorflow.org/api_docs/python/tf/keras/layers/SpatialDropout2D
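To see why spatial dropout suits conv layers, here is a minimal NumPy sketch of its behavior (the `spatial_dropout_2d` helper is illustrative, not the Keras implementation): ordinary dropout zeroes individual activations, while spatial dropout zeroes entire feature maps, which matters because neighboring pixels within a channel are strongly correlated.

```python
import numpy as np

def spatial_dropout_2d(x, rate, rng):
    # x: batch of feature maps, shape (N, H, W, C).
    # Unlike ordinary dropout, the keep/drop decision is made once per
    # channel, so a dropped channel is zeroed across all spatial positions.
    keep = rng.random((x.shape[0], 1, 1, x.shape[3])) >= rate
    return x * keep / (1.0 - rate)  # inverted-dropout scaling

rng = np.random.default_rng(0)
x = np.ones((2, 4, 4, 8))
y = spatial_dropout_2d(x, rate=0.5, rng=rng)
# Each channel of y is either all-zero or uniformly scaled by 1/(1-rate).
```

In the model above this would be used as, for example, `model.add(SpatialDropout2D(0.2))` after a conv block; `SpatialDropout2D` is importable from `tensorflow.keras.layers`.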

  3. Use early stopping to obtain a well-balanced model.
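Using the `val_loss` column from the training log above, a minimal sketch of the early-stopping rule shows where training would have been cut off (the `early_stop_epoch` helper is illustrative; in Keras you would instead pass `tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)` to `model.fit`):

```python
# val_loss per epoch, copied from the training log above (epochs 1-7)
val_loss = [2.7018, 2.5597, 2.5329, 2.6628, 2.8143, 3.3130, 3.9758]

def early_stop_epoch(losses, patience=3):
    """Return the 1-based epoch at which training stops: the first epoch
    where val_loss has not improved for `patience` consecutive epochs."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(losses)

stop = early_stop_epoch(val_loss, patience=3)  # best epoch is 3, so training stops at epoch 6
```

With `restore_best_weights=True`, Keras would also roll the model back to the epoch-3 weights, where validation loss was lowest.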

  4. Your output suggests that categorical_crossentropy would be a better loss.
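Note that switching from `sparse_categorical_crossentropy` to `categorical_crossentropy` requires converting the integer labels produced by the question's code into one-hot vectors. A minimal sketch of that conversion (`to_one_hot` is an illustrative stand-in for `tf.keras.utils.to_categorical`):

```python
import numpy as np

NUM_CLASSES = 17  # matches the 17 food categories in the question

def to_one_hot(labels, num_classes):
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]

y = np.array([0, 5, 16])           # integer labels, as the question's code produces
y_onehot = to_one_hot(y, NUM_CLASSES)
# categorical_crossentropy expects y_onehot; sparse_categorical_crossentropy expects y
```

Mathematically the two losses agree on the same labels; the choice only changes the label format `model.fit` expects.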