Transfer learning with InceptionV3 shows poor validation accuracy during training

Time: 2020-10-04 03:05:17

Tags: tensorflow tf.keras

## MODEL IMPORTING ##

import tensorflow 
import pandas as pd
import numpy as np
import os
import keras
import random
import cv2
import math
import seaborn as sns

from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split

import matplotlib.pyplot as plt

from tensorflow.keras.layers import Dense,GlobalAveragePooling2D,Convolution2D,BatchNormalization
from tensorflow.keras.layers import Flatten,MaxPooling2D,Dropout

from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications.inception_v3 import preprocess_input  # use the InceptionV3-specific preprocessing, not DenseNet's

from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator,img_to_array

from tensorflow.keras.models import Model

from tensorflow.keras.optimizers import Adam

from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

import warnings
warnings.filterwarnings("ignore")

WIDTH = 299
HEIGHT = 299

CLASSES = 4

base_model = InceptionV3(weights='imagenet', include_top=False)

for layer in base_model.layers:
    layer.trainable = False

x = base_model.output
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dropout(0.4)(x)
predictions = Dense(CLASSES, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

model.summary()

model.compile(optimizer='adam',  ##also tried other optimiser --> similar poor accuracy found
          loss='categorical_crossentropy',
          metrics=['accuracy'])


## IMAGE DATA GENERATOR ##

from tensorflow.keras.applications.inception_v3 import preprocess_input
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest',
    validation_split=0.2)



train_generator = train_datagen.flow_from_directory(
    TRAIN_DIR,
    target_size=(HEIGHT, WIDTH),
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset="training")

validation_generator = train_datagen.flow_from_directory(
    TRAIN_DIR,
    target_size=(HEIGHT, WIDTH),
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset="validation")


test_datagen = ImageDataGenerator(rescale=1./255)

generator_test = test_datagen.flow_from_directory(directory=TEST_DIR,
                                              target_size=(HEIGHT, WIDTH),
                                              batch_size=BATCH_SIZE,
                                              class_mode='categorical',
                                              shuffle=False)

## MODEL TRAINING ##

EPOCHS = 20
STEPS_PER_EPOCH = 320 #train_generator.n//train_generator.batch_size
VALIDATION_STEPS = 64 #validation_generator.n//validation_generator.batch_size
history = model.fit_generator(
    train_generator,
    epochs=EPOCHS,
    steps_per_epoch=STEPS_PER_EPOCH,
    validation_data=validation_generator,
    validation_steps=VALIDATION_STEPS)

Results:

Validation accuracy fluctuates around 0.55-0.67, while training accuracy reaches 0.99.

Questions:

  1. Where is the problem in the transfer-learning process?
  2. Are the parameters of the training, validation, and test data generators chosen correctly?

1 answer:

Answer 0: (score: 0)

Well, I think you would be better off training the entire model, so remove the code that makes the base model's layers untrainable. If you look at the documentation for InceptionV3 located here, you can set pooling='max', which adds a GlobalMaxPooling2D layer as the output layer, so you do not need to add your own pooling layer as you did.

I also notice that you imported the callbacks ModelCheckpoint and ReduceLROnPlateau but did not use them in model.fit. An adjustable learning rate helps drive down the validation loss, and ModelCheckpoint is very useful for saving the best model for prediction. See the code below for the implementation; save_loc is the directory where you want ModelCheckpoint to store its results. Note that in ModelCheckpoint I set save_weights_only=True, because this is much faster than saving the entire model on every epoch where the validation loss improves.

checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=save_loc, monitor='val_loss', verbose=1,
        save_best_only=True, save_weights_only=True, mode='auto', save_freq='epoch', options=None)
lr_adjust = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1, verbose=1,
        mode="auto", min_delta=0.00001, cooldown=0, min_lr=0)
callbacks = [checkpoint, lr_adjust]
history = model.fit_generator(train_generator, epochs=EPOCHS,
          steps_per_epoch=STEPS_PER_EPOCH, validation_data=validation_generator,
          validation_steps=VALIDATION_STEPS, callbacks=callbacks)
model.load_weights(save_loc)  # load the saved weights
# after this, use the model to evaluate or predict on the test set;
# if you are satisfied with the results, you can then save the entire model with
model.save(save_loc)
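The pooling='max' suggestion above can be sketched as follows. This is a minimal sketch, not the answerer's exact code; weights=None is used here purely to skip the ImageNet download, and in practice you would keep weights='imagenet'.

```python
import tensorflow as tf
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

CLASSES = 4

# pooling='max' appends a GlobalMaxPooling2D layer, so base_model.output
# is already a flat (None, 2048) vector and no manual pooling layer is needed
base_model = InceptionV3(weights=None, include_top=False,
                         pooling='max', input_shape=(299, 299, 3))
predictions = Dense(CLASSES, activation='softmax')(base_model.output)
model = Model(inputs=base_model.input, outputs=predictions)
```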

Be careful with the test-set generator: make sure the test data gets the same preprocessing as the training data. I notice you only rescaled the pixels; I do not know what your preprocessing function does, but I would use it on the test set as well.

I would also remove the dropout layer at first. Monitor the training and validation loss for each epoch and plot the results. If the training loss keeps decreasing while the validation loss trends upward, you are overfitting, and you can restore the dropout layer if needed.

When you evaluate or predict on the test set, you only need to go through it once. So choose a test batch size such that (number of test samples) / (test batch size) is an integer, and use that quotient as the number of test steps. Below is a handy function that does this, where length is the number of test samples and b_max is the maximum batch size your memory allows.

def get_bs(length, b_max):
    # largest divisor of length that is <= b_max
    batch_size = sorted([int(length / n) for n in range(1, length + 1)
                         if length % n == 0 and length / n <= b_max], reverse=True)[0]
    return batch_size, int(length / batch_size)

# example of use
batch_size, steps = get_bs(10000, 70)
# result is batch_size = 50, steps = 200
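As a quick sanity check on this helper (repeated here so the snippet is self-contained), the returned batch size always divides the sample count exactly, so one pass over the test generator covers every sample exactly once:

```python
def get_bs(length, b_max):
    # largest divisor of length that is <= b_max
    batch_size = sorted([int(length / n) for n in range(1, length + 1)
                         if length % n == 0 and length / n <= b_max], reverse=True)[0]
    return batch_size, int(length / batch_size)

# for any test-set size, batch_size * steps reconstructs it with no remainder
for samples, cap in [(10000, 70), (1357, 100), (4096, 64)]:
    bs, steps = get_bs(samples, cap)
    assert bs * steps == samples and bs <= cap
```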