Overfitting on a small dataset as a sanity check?

Asked: 2018-12-17 19:13:34

Tags: python machine-learning keras

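I am using a pretrained Inception V3 model to classify images into two classes. As a sanity check, I deliberately overfit the model on a small dataset of 20 images. The training results appear to overfit as expected, but I am not sure what the validation accuracy and loss should look like. How do I perform this sanity check correctly to confirm that my model is working?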


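import numpy as np
import keras
from sklearn.model_selection import train_test_split

# data and labels are assumed to have been loaded earlier (not shown)
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels, dtype="uint8")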

# test_size is the fraction of the data held out for testing (0.2 = 20%)
(trainX, testX, trainY, testY) = train_test_split(
                            data,labels, 
                            test_size=0.2, 
                            random_state=42) 

img_width, img_height = 299, 299 #InceptionV3 size

epochs =  25
batch_size = 64

# include_top=False drops the ImageNet classifier head to accommodate new classes
base_model = keras.applications.InceptionV3(
        weights ='imagenet',
        include_top=False, 
        input_shape = (img_width,img_height,3))

# Classifier model on top of the convolutional base
model_top = keras.models.Sequential()
model_top.add(keras.layers.GlobalAveragePooling2D(input_shape=base_model.output_shape[1:], data_format=None))
model_top.add(keras.layers.Dense(350, activation='relu'))
# model_top.add(keras.layers.Dropout(0.4))
model_top.add(keras.layers.Dense(1, activation='sigmoid'))
model = keras.models.Model(inputs=base_model.input, outputs=model_top(base_model.output))

model.compile(optimizer = keras.optimizers.Adam(
                    lr=0.00001,
                    beta_1=0.9,
                    beta_2=0.999,
                    epsilon=1e-08),
                    loss='binary_crossentropy',
                    metrics=['accuracy'])

train_datagen = keras.preprocessing.image.ImageDataGenerator(
          zoom_range = 0.05,
          width_shift_range = 0.05, 
          height_shift_range = 0.05,
          horizontal_flip = True,
          vertical_flip = True,
          fill_mode ='nearest') 


val_datagen = keras.preprocessing.image.ImageDataGenerator()


train_generator = train_datagen.flow(
        trainX, 
        trainY,
        batch_size=batch_size)

validation_generator = val_datagen.flow(
                testX,
                testY,
                batch_size=batch_size)

1 Answer:

Answer 0 (score: 0)

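Split the dataset into training, development (validation), and test sets. Then watch how the accuracy and loss on the dev set change compared with the training set. Keras's TensorBoard callback can help with this: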

tbCallBack = keras.callbacks.TensorBoard(log_dir='./Graph', histogram_freq=0, write_graph=True, write_images=True)
...
model.fit(...inputs and parameters..., callbacks=[tbCallBack])
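
The three-way split mentioned above could be sketched roughly as follows. This is only an illustration in plain Python: the helper name `three_way_split` and the 20% dev and test fractions are assumptions, not something taken from the question.

```python
import random


def three_way_split(n_samples, dev_fraction=0.2, test_fraction=0.2, seed=42):
    """Return (train_idx, dev_idx, test_idx) as disjoint lists of sample indices.

    Hypothetical helper; the fractions and the seed are illustrative defaults.
    """
    if not 0 <= dev_fraction + test_fraction < 1:
        raise ValueError("dev_fraction + test_fraction must be in [0, 1)")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    n_test = int(round(n_samples * test_fraction))
    n_dev = int(round(n_samples * dev_fraction))
    test_idx = indices[:n_test]
    dev_idx = indices[n_test:n_test + n_dev]
    train_idx = indices[n_test + n_dev:]
    return train_idx, dev_idx, test_idx


train_idx, dev_idx, test_idx = three_way_split(20)
print(len(train_idx), len(dev_idx), len(test_idx))  # 12 4 4
```

The index lists can be used to select rows from NumPy arrays, for example `data[train_idx]`. Note that with only 20 images each held-out set has 4 samples, so accuracy on it can only move in steps of 0.25.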

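
If you prefer numbers to plots, the same train-versus-dev comparison could be done on the per-epoch metric lists. The sketch below is an assumption-laden illustration: `summarize_curves` is a hypothetical helper, and the metric values are made up purely to show the call, not measured results. The lists would normally come from the `History` object returned by `model.fit`; the key names in `History.history` differ between Keras versions, so the lists are passed in explicitly here.

```python
def summarize_curves(train_acc, val_acc, train_loss, val_loss):
    """Compare training and validation metrics given one value per epoch.

    Hypothetical helper for illustration only.
    """
    n_epochs = len(train_acc)
    if n_epochs == 0:
        raise ValueError("metric lists must not be empty")
    if not (len(val_acc) == len(train_loss) == len(val_loss) == n_epochs):
        raise ValueError("all metric lists must have the same length")
    # Index of the epoch with the lowest validation loss
    best = min(range(n_epochs), key=lambda i: val_loss[i])
    return {
        "final_acc_gap": train_acc[-1] - val_acc[-1],
        "final_loss_gap": val_loss[-1] - train_loss[-1],
        "best_val_loss_epoch": best + 1,  # 1-based epoch number
        "val_loss_rising": val_loss[-1] > val_loss[best],
    }


# Made-up values, only to show how the helper is called
summary = summarize_curves(
    train_acc=[0.60, 0.85, 1.00, 1.00],
    val_acc=[0.50, 0.75, 0.75, 0.50],
    train_loss=[0.70, 0.40, 0.10, 0.02],
    val_loss=[0.72, 0.60, 0.65, 0.90],
)
print(summary["best_val_loss_epoch"], summary["val_loss_rising"])  # 2 True
```

A training loss that keeps falling while the validation loss turns upward after its best epoch is the usual signature of overfitting, which is what a deliberate overfit on 20 images should produce.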

You can also check your accuracy against the figures reported in the original paper.