I'm having trouble fine-tuning an Inception model with Keras.
Using tutorials and the documentation, I managed to train a fully connected top model that classifies my dataset into the proper classes from Inception's bottleneck features with an accuracy above 99%.
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications

# dimensions of our images.
img_width, img_height = 150, 150

# paths for saving weights and finding datasets
top_model_weights_path = 'Inception_fc_model_v0.h5'
train_data_dir = '../data/train2'
validation_data_dir = '../data/train2'

# training related parameters
inclusive_images = 1424
nb_train_samples = 1424
nb_validation_samples = 1424
epochs = 50
batch_size = 16


def save_bottlebeck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build bottleneck features
    model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_shape=(img_width, img_height, 3))

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=False)

    bottleneck_features_train = model.predict_generator(
        generator, nb_train_samples // batch_size)

    np.save('bottleneck_features_train', bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=False)

    bottleneck_features_validation = model.predict_generator(
        generator, nb_validation_samples // batch_size)

    np.save('bottleneck_features_validation', bottleneck_features_validation)


def train_top_model():
    train_data = np.load('bottleneck_features_train.npy')
    train_labels = np.array(range(inclusive_images))

    validation_data = np.load('bottleneck_features_validation.npy')
    validation_labels = np.array(range(inclusive_images))

    print('base size ', train_data.shape[1:])

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(inclusive_images, activation='softmax'))

    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='Adam',
                  metrics=['accuracy'])

    proceed = True

    # model.load_weights(top_model_weights_path)
    while proceed:
        history = model.fit(train_data, train_labels,
                            epochs=epochs,
                            batch_size=batch_size)  # ,
                            # validation_data=(validation_data, validation_labels), verbose=1)
        if history.history['acc'][-1] > .99:
            proceed = False

    model.save_weights(top_model_weights_path)


save_bottlebeck_features()
train_top_model()
Epoch 50/50
1424/1424 [==============================] - 17s 12ms/step - loss: 0.0398 - acc: 0.9909
I was also able to stack this top model onto Inception to create my full model, and use that full model to successfully classify my training set.
from keras import Model
from keras import optimizers
from keras.callbacks import EarlyStopping

img_width, img_height = 150, 150

top_model_weights_path = 'Inception_fc_model_v0.h5'
train_data_dir = '../data/train2'
validation_data_dir = '../data/train2'

# how many inclusive examples do we have?
inclusive_images = 1424
nb_train_samples = 1424
nb_validation_samples = 1424
epochs = 50
batch_size = 16

# build the complete network for evaluation
base_model = applications.inception_v3.InceptionV3(weights='imagenet', include_top=False, input_shape=(img_width, img_height, 3))

top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(1000, activation='relu'))
top_model.add(Dense(inclusive_images, activation='softmax'))

top_model.load_weights(top_model_weights_path)

# combine base and top model
fullModel = Model(input=base_model.input, output=top_model(base_model.output))

# predict with the full training dataset
results = fullModel.predict_generator(ImageDataGenerator(rescale=1. / 255).flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False))
Inspecting the results produced by this full model matches the accuracy of the fully connected model trained on the bottleneck features.
import matplotlib.pyplot as plt
import operator

# retrieve what the softmax-based class assignments would be from results
resultMaxClassIDs = [max(enumerate(result), key=operator.itemgetter(1))[0] for result in results]

# resultMaxClassIDs should be equal to range(inclusive_images), so we subtract the two
# and plot the log of the absolute value, looking for spikes that indicate the values aren't equal
plt.plot([np.log(np.abs(x) + 10) for x in (np.array(resultMaxClassIDs) - np.array(range(inclusive_images)))])
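(For reference, the same class assignments can be read off with a single vectorized call; an equivalent sketch:)

# equivalent to the enumerate/max comprehension above
resultMaxClassIDs = np.argmax(results, axis=1)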
Here is the problem: when I take this full model and attempt to train it, the training accuracy drops to 0 even though validation accuracy stays above 99%.
model2 = fullModel

for layer in model2.layers[:-2]:
    layer.trainable = False

# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
# model.compile(loss='binary_crossentropy', optimizer=optimizers.SGD(lr=1e-4, momentum=0.9), metrics=['accuracy'])
model2.compile(loss='categorical_crossentropy',
               optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
               metrics=['accuracy'])

train_datagen = ImageDataGenerator(rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

callback = [EarlyStopping(monitor='acc', min_delta=0, patience=3, verbose=0, mode='auto', baseline=None)]

# fine-tune the model
model2.fit_generator(
    #train_generator,
    validation_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    validation_steps=nb_validation_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator)
Epoch 1/50
89/89 [==============================] - 388s 4s/step - loss: 13.5787 - acc: 0.0000e+00 - val_loss: 0.0353 - val_acc: 0.9937
and things only get worse as training goes on:
Epoch 21/50
89/89 [==============================] - 372s 4s/step - loss: 7.3850 - acc: 0.0035 - val_loss: 0.5813 - val_acc: 0.8272
The only thing I can think of is that the training labels are somehow being assigned incorrectly on this last training run, but I have successfully executed similar code with VGG16 before.
I have searched through the code trying to find a discrepancy that would explain why a model that makes accurate predictions over 99% of the time drops its training accuracy while maintaining validation accuracy during fine-tuning, but I can't figure it out. Any help would be appreciated.
Some information about the code and environment: a few things about my setup will seem odd, but that's just how it is, and I have already checked several possible leads that turned out to be unrelated.
Answer 0 (score: 7):
Note: since your problem is a bit strange and hard to debug without having your trained model and dataset, this answer is just a (best) guess after considering the many things that could have gone wrong. Please provide your feedback, and I will delete this answer if it does not work.
Since inception_V3 contains BatchNormalization layers, the problem may be caused by the (somewhat ambiguous) behavior of that layer when its trainable parameter is set to False.
Now, to see whether this is the source of the problem, set the learning phase when defining the model for fine-tuning:
from keras import backend as K

K.set_learning_phase(0)

base_model = applications.inception_v3.InceptionV3(weights='imagenet', include_top=False, input_shape=(img_width, img_height, 3))

for layer in base_model.layers:
    layer.trainable = False

K.set_learning_phase(1)

top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(1000, activation='relu'))
top_model.add(Dense(inclusive_images, activation='softmax'))

top_model.load_weights(top_model_weights_path)

# combine base and top model
fullModel = Model(input=base_model.input, output=top_model(base_model.output))

fullModel.compile(loss='categorical_crossentropy',
                  optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
                  metrics=['accuracy'])

#####################################################################
# Here, define the generators and then fit the model same as before #
#####################################################################
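For completeness, a sketch of that final step, assuming train_generator and validation_generator are built exactly as in your question:

# fit the recompiled full model, reusing the question's generators
fullModel.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)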
Side note: this is not causing any problem in your case, but bear in mind that when you use top_model(base_model.output), the whole Sequential model (i.e. top_model) is stored as a single layer of fullModel. You can verify this with fullModel.summary() or print(fullModel.layers[-1]). Therefore, when you used:

for layer in model2.layers[:-2]:
    layer.trainable = False

you actually did not freeze the last layer of base_model either. However, since that layer is a Concatenate layer, and therefore has no trainable parameters, no problem arises and it behaves the way you intended.
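A quick sketch of that check, using the names defined above:

# the whole Sequential top model is registered as one layer of fullModel
print(fullModel.layers[-1])               # -> the top_model Sequential instance
print(fullModel.layers[-1] is top_model)  # -> True

# freezing base layers by explicit reference avoids index-slicing surprises:
for layer in base_model.layers:
    layer.trainable = False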
Answer 1 (score: 1):
As with the previous reply, I'll try to share some thoughts to see whether they help.

There are a couple of things that caught my attention (and may be worth reviewing). Note: some of them should have given you issues with the separate models as well.

- It seems you used sparse_categorical_crossentropy for the first training, while categorical_crossentropy was used in the second one. Is that correct? Because I believe they assume labels are encoded differently (sparse expects integer class indices, the other expects one-hot vectors); see the sketch at the end of this answer.
- Have you tried setting the layers you recently added to trainable = True? I know you have already set the others to trainable = False, but maybe that's also worth checking.

I hope this helps.
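To make the first point concrete, here is a minimal sketch of the two label encodings (using hypothetical 3-class labels, not your data):

import numpy as np
from keras.utils import to_categorical

# sparse_categorical_crossentropy expects integer class indices
sparse_labels = np.array([0, 2, 1])                       # shape (3,)

# categorical_crossentropy expects one-hot rows
one_hot_labels = to_categorical(sparse_labels, num_classes=3)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]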