Keras: accuracy drops when fine-tuning begins

Time: 2018-09-11 18:28:24

Tags: python tensorflow machine-learning keras deep-learning

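I am having trouble fine-tuning an Inception model with Keras.

Using tutorials and documentation, I managed to build a fully connected top model that classifies my dataset into the proper categories from Inception's bottleneck features, with accuracy above 99%.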

import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications


# dimensions of our images.
img_width, img_height = 150, 150

#paths for saving weights and finding datasets
top_model_weights_path = 'Inception_fc_model_v0.h5'
train_data_dir = '../data/train2'
validation_data_dir = '../data/train2' 

#training related parameters?
inclusive_images = 1424
nb_train_samples = 1424
nb_validation_samples = 1424
epochs = 50
batch_size = 16


def save_bottlebeck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build bottleneck features
    model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_shape=(img_width,img_height,3))

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=False)

    bottleneck_features_train = model.predict_generator(
        generator, nb_train_samples // batch_size)

    np.save('bottleneck_features_train', bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=False)

    bottleneck_features_validation = model.predict_generator(
        generator, nb_validation_samples // batch_size)

    np.save('bottleneck_features_validation', bottleneck_features_validation)

def train_top_model():
    train_data = np.load('bottleneck_features_train.npy')
    train_labels = np.array(range(inclusive_images))

    validation_data = np.load('bottleneck_features_validation.npy')
    validation_labels = np.array(range(inclusive_images))

    print('base size ', train_data.shape[1:])

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(inclusive_images, activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy',
             optimizer='Adam',
             metrics=['accuracy'])

    proceed = True

    #model.load_weights(top_model_weights_path)

    while proceed:
        history = model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size)#,
              #validation_data=(validation_data, validation_labels), verbose=1)
        if history.history['acc'][-1] > .99:
            proceed = False

    model.save_weights(top_model_weights_path)


save_bottlebeck_features()
train_top_model()
  

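Epoch 50/50
1424/1424 [==============================] - 17s 12ms/step - loss: 0.0398 - acc: 0.9909

I was also able to stack this top model onto Inception to create my full model, and to use that full model to successfully classify my training set.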

from keras import Model
from keras import optimizers
from keras.callbacks import EarlyStopping

img_width, img_height = 150, 150

top_model_weights_path = 'Inception_fc_model_v0.h5'
train_data_dir = '../data/train2'
validation_data_dir = '../data/train2' 

#how many inclusive examples do we have?
inclusive_images = 1424
nb_train_samples = 1424
nb_validation_samples = 1424
epochs = 50
batch_size = 16

# build the complete network for evaluation
base_model = applications.inception_v3.InceptionV3(weights='imagenet', include_top=False, input_shape=(img_width,img_height,3))

top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(1000, activation='relu'))
top_model.add(Dense(inclusive_images, activation='softmax'))

top_model.load_weights(top_model_weights_path)

#combine base and top model
fullModel = Model(inputs=base_model.input, outputs=top_model(base_model.output))

#predict with the full training dataset
results = fullModel.predict_generator(ImageDataGenerator(rescale=1. / 255).flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=False))

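Inspecting the results from this full model shows an accuracy that matches that of the fully connected model trained on the bottleneck features.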

import matplotlib.pyplot as plt
import operator

#retrieve what the softmax based class assignments would be from results
resultMaxClassIDs = [ max(enumerate(result), key=operator.itemgetter(1))[0] for result in results]

#resultMaxClassIDs should be equal to range(inclusive_images) so we subtract the two and plot the log of the absolute value 
#looking for spikes that indicate the values aren't equal 
plt.plot([np.log(np.abs(x)+10) for x in (np.array(resultMaxClassIDs) - np.array(range(inclusive_images)))])

results: spikes are misclassifications
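The same check can be sketched more directly with NumPy's argmax. This is only an illustration: the small `results` array below is made-up stand-in data, not output from the real model, and it assumes (as in the code above) that sample i is expected to belong to class i.

```python
import numpy as np

# Stand-in for `results`: a (num_samples, num_classes) array of softmax outputs.
# These values are invented for illustration only.
results = np.array([
    [0.90, 0.05, 0.05],   # sample 0 -> class 0 (matches)
    [0.10, 0.80, 0.10],   # sample 1 -> class 1 (matches)
    [0.60, 0.30, 0.10],   # sample 2 -> class 0 (expected class 2)
])

# With one image per class and shuffle=False, sample i should be class i.
expected = np.arange(results.shape[0])
predicted = np.argmax(results, axis=1)

# Indices of the samples whose predicted class differs from the expected one.
misclassified = np.flatnonzero(predicted != expected)
accuracy = np.mean(predicted == expected)

print('misclassified sample indices:', misclassified)
print('accuracy:', accuracy)
```

Listing the mismatching indices avoids having to read spikes off a plot, and the mean of the comparison gives the accuracy directly.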

Here is the problem: when I take this full model and try to train it, training accuracy drops to 0 even though validation accuracy stays above 99%.

model2 = fullModel

for layer in model2.layers[:-2]:
    layer.trainable = False

# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
#model.compile(loss='binary_crossentropy', optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),  metrics=['accuracy'])

model2.compile(loss='categorical_crossentropy',
             optimizer=optimizers.SGD(lr=1e-4, momentum=0.9), 
             metrics=['accuracy'])

train_datagen = ImageDataGenerator(rescale=1. / 255)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

callback = [EarlyStopping(monitor='acc', min_delta=0, patience=3, verbose=0, mode='auto', baseline=None)]
# fine-tune the model
model2.fit_generator(
    #train_generator,
    validation_generator,
    steps_per_epoch=nb_train_samples//batch_size,
    validation_steps = nb_validation_samples//batch_size,
    epochs=epochs,
    validation_data=validation_generator)
  

Epoch 1/50
89/89 [==============================] - 388s 4s/step - loss: 13.5787 - acc: 0.0000e+00 - val_loss: 0.0353 - val_acc: 0.9937

As training progresses, things get worse:

Epoch 21/50
89/89 [==============================] - 372s 4s/step - loss: 7.3850 - acc: 0.0035 - val_loss: 0.5813 - val_acc: 0.8272

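The only thing I can think of is that the training labels are somehow being assigned incorrectly during this last training run, but I have successfully done this with similar code using VGG16 before.

I have searched through the code for a discrepancy that would explain why a model that predicts correctly over 99% of the time loses its training accuracy while keeping its validation accuracy during fine-tuning, but I can't figure it out. Any help would be appreciated.

Information about the code and environment:

Some things will stand out as odd, but they are meant to be that way:

  • There is only 1 image per class. This NN is intended to classify objects whose environmental and orientation conditions are controlled. There is only one acceptable image per class, corresponding to the correct environment and rotation.
  • The test and validation sets are the same. This NN is only ever meant to be used on the classes it is being trained on. The images it will process will be carbon copies of the class examples. It is my intent to overfit the model to these classes.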

I am using:

  • Windows 10
  • Python 3.5.6 under Anaconda client 1.6.14
  • Keras 2.2.2
  • Tensorflow 1.10.0 as the backend
  • CUDA 9.0
  • CuDNN 8.0

I have checked out:

  1. Keras accuracy discrepancy in fine-tuned model
  2. VGG16 Keras fine tuning: low accuracy
  3. Keras: model accuracy drops after reaching 99 percent accuracy and loss 0.01
  4. Keras inception v3 retraining and finetuning error
  5. How to find which version of TensorFlow is installed in my system?

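but they appear to be unrelated.

2 Answers:

Answer 0: (score: 7)

Note: since your problem is a bit strange and difficult to debug without having your trained model and dataset, this answer is just a (best) guess after considering the many things that might have gone wrong. Please provide your feedback, and I will delete this answer if it does not work.

Since inception_V3 contains BatchNormalization layers, the problem may be caused by the way this layer behaves when its trainable parameter is set to False.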
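To make the suspicion concrete, here is a minimal NumPy sketch of the two normalization modes. It is not Keras code: the function names, the activation values, and the stored moving statistics are all assumptions chosen for illustration, and the learned scale and offset (gamma and beta) are left out. The assumption being illustrated is that, in this Keras version, a frozen BatchNormalization layer still normalizes with the statistics of the current mini-batch while the learning phase is "training", and only uses its stored moving averages at inference time.

```python
import numpy as np

def batchnorm_inference(x, moving_mean, moving_var, eps=1e-3):
    # Inference mode: normalize with the stored moving statistics.
    return (x - moving_mean) / np.sqrt(moving_var + eps)

def batchnorm_training(x, eps=1e-3):
    # Training mode: normalize with the statistics of the current mini-batch.
    batch_mean = x.mean(axis=0)
    batch_var = x.var(axis=0)
    return (x - batch_mean) / np.sqrt(batch_var + eps)

rng = np.random.RandomState(0)

# Hypothetical activations: one batch of 16 samples with 4 channels whose
# statistics differ from the (equally hypothetical) stored moving averages.
x = rng.normal(loc=5.0, scale=2.0, size=(16, 4))
moving_mean = np.zeros(4)
moving_var = np.ones(4)

out_inference = batchnorm_inference(x, moving_mean, moving_var)
out_training = batchnorm_training(x)

print('inference-mode mean per channel:', out_inference.mean(axis=0))
print('training-mode mean per channel: ', out_training.mean(axis=0))
```

The same input produces different activations in the two modes. If the top model was trained on bottleneck features computed in inference mode, feeding it training-mode activations during fine-tuning could plausibly explain a training accuracy near 0 alongside a high validation accuracy.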

Now, let's see whether this is the root of the problem: set the learning phase when defining the model for fine-tuning:

from keras import backend as K

K.set_learning_phase(0)

base_model = applications.inception_v3.InceptionV3(weights='imagenet', include_top=False, input_shape=(img_width,img_height,3))

for layer in base_model.layers:
    layer.trainable = False

K.set_learning_phase(1)

top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(1000, activation='relu'))
top_model.add(Dense(inclusive_images, activation='softmax'))

top_model.load_weights(top_model_weights_path)

#combine base and top model
fullModel = Model(inputs=base_model.input, outputs=top_model(base_model.output))

fullModel.compile(loss='categorical_crossentropy',
             optimizer=optimizers.SGD(lr=1e-4, momentum=0.9), 
             metrics=['accuracy'])


#####################################################################
# Here, define the generators and then fit the model same as before #
#####################################################################

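Side note: this is not causing any problem in your case, but keep in mind that when you use top_model(base_model.output), the whole Sequential model (i.e. top_model) is stored as a single layer of fullModel. You can verify this with either fullModel.summary() or print(fullModel.layers[-1]). Hence, when you used: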

for layer in model2.layers[:-2]:
    layer.trainable = False 

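you were actually not freezing the last layer of base_model either. However, since it is a Concatenate layer and therefore has no trainable parameters, no problem occurs and it behaves as you intended.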
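The effect of the slice can be sketched with a plain Python list. The layer names below are illustrative assumptions, a heavily shortened stand-in for fullModel.layers, not the real list of several hundred layers:

```python
# Illustrative stand-in for [layer.name for layer in fullModel.layers].
# The names are assumptions; the nested top_model appears as ONE entry.
layer_names = [
    'input_1',
    'conv2d_1',
    'batch_normalization_1',
    'mixed9',
    'mixed10',        # last layer of the base model (a Concatenate layer)
    'sequential_1',   # the whole top_model, stored as a single layer
]

# layers[:-2] skips the last TWO entries, not just the top model.
frozen = layer_names[:-2]
left_trainable = layer_names[-2:]

print('frozen:', frozen)
print('left trainable:', left_trainable)
```

Slicing with [:-1] instead would freeze every base layer and leave only the nested top model trainable.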

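Answer 1: (score: 1)

Like the previous reply, I'll try to share some thoughts to see whether they help.

A few things caught my attention (and may be worth reviewing). Note: some of them should have caused problems with the separate models as well.

  • Correct me if I'm wrong, but it seems you used sparse_categorical_crossentropy for the first training and categorical_crossentropy for the second. Is that right? I believe they expect the labels in different forms (sparse expects integers, the other expects one-hot vectors).
  • Have you tried setting the layers you added at the end to trainable = True? I know you have set the others to trainable = False, but it may be worth checking.
  • It seems the data generators do not use the default preprocessing function of Inception v3 (keras.applications.inception_v3.preprocess_input), which scales pixel values to the range [-1, 1]; the generators here only rescale to [0, 1].
  • Have you tried experimenting with the Functional API instead of the Sequential API?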
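To illustrate the first and third points, here is a small NumPy sketch. It does not call Keras: the helper functions are hypothetical re-implementations written for this example, and the probabilities and pixel values are made up. It shows that the two losses agree once the labels are in the form each one expects, and how rescaling by 1/255 differs from scaling to [-1, 1], which is what I understand the Inception v3 preprocessing to do.

```python
import numpy as np

def to_one_hot(int_labels, num_classes):
    # Turn integer class ids into one-hot rows.
    one_hot = np.zeros((len(int_labels), num_classes))
    one_hot[np.arange(len(int_labels)), int_labels] = 1.0
    return one_hot

def sparse_cce(int_labels, probs):
    # Cross-entropy from integer labels: pick the probability of the true class.
    return -np.log(probs[np.arange(len(int_labels)), int_labels])

def cce(one_hot_labels, probs):
    # Cross-entropy from one-hot labels: sum over the class axis.
    return -np.sum(one_hot_labels * np.log(probs), axis=1)

# Made-up softmax outputs for two samples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
int_labels = np.array([0, 1])
one_hot_labels = to_one_hot(int_labels, num_classes=3)

losses_match = np.allclose(sparse_cce(int_labels, probs),
                           cce(one_hot_labels, probs))
print('losses match:', losses_match)

# The same pixel values under the two input scalings.
pixels = np.array([0.0, 127.5, 255.0])
scaled_0_1 = pixels / 255.0             # what rescale=1. / 255 produces
scaled_minus1_1 = pixels / 127.5 - 1.0  # scaling to the range [-1, 1]
print('rescale=1/255:', scaled_0_1)
print('scaled to [-1, 1]:', scaled_minus1_1)
```

So the choice of loss is harmless as long as each one receives labels in its own format, whereas a change of input scaling between feature extraction and fine-tuning would shift every activation in the network.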

I hope that helps.