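I'm trying to use Keras's pretrained VGG16 CNN on my own data for a simple binary image classification problem. I currently have 1184 images for training and 512 images for validation; note that I only have two classes. In my code, I first compute the bottleneck features of the images and save them. A small classifier is then trained on those features with the binary_crossentropy loss function. Most of the time this seems to work fine, but on some runs, apparently at random, the model gets stuck: its train/val loss and accuracy stay exactly constant, with accuracy sitting at 50%.
Is there something wrong with the way I'm doing this? Or am I just overfitting, and the model randomly gets stuck in a local minimum with no chance of getting out? Here is my code: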
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
from keras.losses import categorical_crossentropy, binary_crossentropy
from keras.optimizers import RMSprop
from keras.utils import to_categorical
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from keras.applications.imagenet_utils import decode_predictions
from keras.applications.vgg16 import preprocess_input
img_width, img_height = 256, 256
train_count = 1184
validation_count = 512
batch_size = 32
epochs = 50
top_model_weights_path = 'bottleneck_fc_model.h5'
learnrate = 10e-4  # note: 10e-4 is 1e-3 (0.001), not 1e-4
train_data_dir = 'path/to/train'
validation_data_dir = 'path/to/validation'
# ---- Compute and save bottleneck features (VGG16 convolutional base) ----
model = applications.VGG16(include_top=False, weights='imagenet')
datagen = ImageDataGenerator(rescale=1. / 255, preprocessing_function = preprocess_input)
generator = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)
bottleneck_features_train = model.predict_generator(generator, train_count // batch_size)
np.save('bottleneck_features_train.npy', bottleneck_features_train)
generator = datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)
datagen = ImageDataGenerator(rescale=1./255, preprocessing_function = preprocess_input)
bottleneck_features_validation = model.predict_generator(generator, validation_count // batch_size)
np.save('bottleneck_features_validation.npy', bottleneck_features_validation)
# ---- Train the top classifier on the saved bottleneck features ----
train_data = np.load('bottleneck_features_train.npy')
train_labels = np.array([0] * int(train_count/2) + [1] * int(train_count/2))
validation_data = np.load('bottleneck_features_validation.npy')
validation_labels = np.array([0] * 256 + [1] * 256)
model = Sequential()
model.add(Flatten(input_shape=train_data.shape[1:]))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
rms = RMSprop(lr=learnrate)
model.compile(optimizer=rms, loss='binary_crossentropy', metrics=['accuracy'])
hist = model.fit(train_data, train_labels, verbose=1,
                 epochs=epochs,
                 batch_size=batch_size,
                 validation_data=(validation_data, validation_labels))
model.save_weights(top_model_weights_path)
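One detail in the code above that may be worth checking: both generators pass rescale=1./255 together with preprocessing_function=preprocess_input. Assuming ImageDataGenerator applies the preprocessing function first and then multiplies by rescale, the VGG16 base would receive inputs roughly 255 times smaller than the mean-centered 0-255 range its ImageNet weights were trained on. The NumPy sketch below imitates that assumed order on a fake batch. Note that caffe_style_preprocess is a hypothetical stand-in for preprocess_input (RGB to BGR, then subtracting the ImageNet channel means), not the Keras function itself.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by caffe-style VGG preprocessing.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(x):
    """Hypothetical stand-in for preprocess_input: RGB -> BGR, subtract channel means."""
    x = np.asarray(x, dtype=np.float32)[..., ::-1]  # reverse the channel axis
    return x - IMAGENET_BGR_MEAN

# A fake batch of RGB images with pixel values in [0, 255].
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 256, 256, 3)).astype(np.float32)

# Input range the pretrained weights were trained on.
expected = caffe_style_preprocess(images)
# Assumed effect of the generator settings: preprocessing, then rescale=1/255.
as_configured = caffe_style_preprocess(images) / 255.0

print("expected range:     ", expected.min(), expected.max())
print("as configured range:", as_configured.min(), as_configured.max())
```

If the real pipeline behaves like this sketch, dropping rescale and keeping only preprocess_input would be the natural experiment to try. Whether that alone accounts for the stuck runs is not something the sketch can show.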
Here is some of the output from a run where it gets stuck:
Epoch 1/50
1184/1184 [==============================] - 5s 4ms/step - loss: 7.8217 - acc: 0.4949 - val_loss: 7.9712 - val_acc: 0.5000
Epoch 2/50
1184/1184 [==============================] - 4s 4ms/step - loss: 7.9712 - acc: 0.5000 - val_loss: 7.9712 - val_acc: 0.5000
Epoch 3/50
1184/1184 [==============================] - 4s 4ms/step - loss: 7.9712 - acc: 0.5000 - val_loss: 7.9712 - val_acc: 0.5000
Epoch 4/50
1184/1184 [==============================] - 4s 4ms/step - loss: 7.9712 - acc: 0.5000 - val_loss: 7.9712 - val_acc: 0.5000
Epoch 5/50
1184/1184 [==============================] - 4s 4ms/step - loss: 7.9712 - acc: 0.5000 - val_loss: 7.9712 - val_acc: 0.5000
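The constant loss of 7.9712 in this log may itself be a clue. It is about half of ln(2^23) ≈ 15.94, which is what a clipped binary cross-entropy yields when the sigmoid output has saturated at 1 for every sample while half of the labels are 0. The sketch below reproduces that figure in NumPy, assuming predictions are clipped to [1e-7, 1 - 1e-7] in float32 before the logarithm (believed to match the Keras backend's default epsilon). The helper clipped_binary_crossentropy is illustrative only, not a Keras API.

```python
import numpy as np

def clipped_binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy in float32, predictions clipped to [eps, 1 - eps]."""
    y_true = np.asarray(y_true, dtype=np.float32)
    eps = np.float32(eps)
    one = np.float32(1.0)
    p = np.clip(np.asarray(y_pred, dtype=np.float32), eps, one - eps)
    per_sample = -(y_true * np.log(p) + (one - y_true) * np.log(one - p))
    return float(per_sample.mean())

# Balanced labels, as in the question: half class 0, half class 1.
labels = np.array([0] * 592 + [1] * 592)

# A "stuck" classifier whose sigmoid has saturated and outputs 1.0 for every image.
stuck_predictions = np.ones(len(labels))

stuck_loss = clipped_binary_crossentropy(labels, stuck_predictions)
stuck_accuracy = float(np.mean((stuck_predictions > 0.5) == labels))

print(f"loss={stuck_loss:.4f} acc={stuck_accuracy:.4f}")
```

Under the same assumption, an output pinned at 0 would instead plateau near 8.06 (half of -ln(1e-7)), so the logged value suggests the prediction is stuck at class 1. A saturated sigmoid has a near-zero gradient, which would be consistent with the loss never moving once a run falls into this state. That points more toward an early blow-up of the weights (input scale or learning rate) than toward overfitting.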