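I am trying to classify images into 2 classes. Although I get high training and validation accuracy (0.97) after 10 epochs, my test results are terrible (accuracy of 0.48), and the confusion matrix shows the network predicting the wrong class for the images (results attached).
The dataset has only 2 classes, each with 10,000 example images (after augmentation). I am using the VGG16 network. The full dataset was first split to set aside a 20% test set (this split takes random images from each class, so it is shuffled). The remaining images are split into 80% training and 20% validation sets (as shown on the ImageDataGenerator line in the code). So in the end there are:
12,904 training images belonging to 2 classes
3,224 validation images belonging to 2 classes
4,032 test images belonging to 2 classes
Here is my code: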
import os

import keras
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sn
from keras.applications import VGG16, VGG19, ResNet50, DenseNet201
from keras.preprocessing import image
from sklearn.metrics import classification_report, confusion_matrix

def CNN(CNN='VGG16', choice='predict', prediction='./dataset/Test/image.jpg'):
    ''' Train images using one of several CNNs '''
    Train = './dataset/Train'
    Tests = './dataset/Test'
    shape = (224, 224)
    epochs = 10
    batches = 16
    # One class per sub-directory of the training folder
    classes = []
    for c in os.listdir(Train): classes.append(c)
    # Hold out 20% of the Train folder for validation
    IDG = keras.preprocessing.image.ImageDataGenerator(validation_split=0.2)
    train = IDG.flow_from_directory(Train, target_size=shape, color_mode='rgb',
        classes=classes, batch_size=batches, shuffle=True, subset='training')
    valid = IDG.flow_from_directory(Train, target_size=shape, color_mode='rgb',
        classes=classes, batch_size=batches, shuffle=True, subset='validation')
    tests = IDG.flow_from_directory(Tests, target_size=shape, color_mode='rgb',
        classes=classes, batch_size=batches, shuffle=True)
    input_shape = train.image_shape
    # Match the architecture name case-insensitively
    name = CNN.lower()
    if name == 'vgg16':
        model = VGG16(weights=None, input_shape=input_shape,
            classes=len(classes))
    elif name == 'vgg19':
        model = VGG19(weights=None, input_shape=input_shape,
            classes=len(classes))
    elif name == 'resnet50':
        model = ResNet50(weights=None, input_shape=input_shape,
            classes=len(classes))
    elif name == 'densenet201':
        model = DenseNet201(weights=None, input_shape=input_shape,
            classes=len(classes))
    else:
        raise ValueError('Unknown CNN: ' + CNN)
    model.compile(optimizer=keras.optimizers.SGD(
            lr=1e-3,
            decay=1e-6,
            momentum=0.9,
            nesterov=True),
        loss='categorical_crossentropy',
        metrics=['accuracy'])
    # Number of batches per epoch, derived from the size of the first batch
    Esteps = int(train.samples/train.next()[0].shape[0])
    Vsteps = int(valid.samples/valid.next()[0].shape[0])
    if choice == 'train':
        history = model.fit_generator(train,
            steps_per_epoch=Esteps,
            epochs=epochs,
            validation_data=valid,
            validation_steps=Vsteps,
            verbose=1)
        # Loss curves
        plt.plot(history.history['loss'])
        plt.plot(history.history['val_loss'])
        plt.title('Model Loss')
        plt.ylabel('Loss')
        plt.xlabel('Epoch')
        plt.legend(['Train', 'Validation'], loc='upper left')
        plt.show()
        # Accuracy curves
        plt.plot(history.history['acc'])
        plt.plot(history.history['val_acc'])
        plt.title('Model Accuracy')
        plt.ylabel('Accuracy')
        plt.xlabel('Epoch')
        plt.legend(['Train', 'Validation'], loc='upper left')
        plt.show()
        # Confusion matrix and classification report on the test set
        Y_pred = model.predict_generator(tests, verbose=1)
        y_pred = np.argmax(Y_pred, axis=1)
        matrix = confusion_matrix(tests.classes, y_pred)
        df_cm = pd.DataFrame(matrix, index=classes, columns=classes)
        plt.figure(figsize=(10,7))
        sn.heatmap(df_cm, annot=True)
        print(classification_report(tests.classes,y_pred,target_names=classes))
        model.save_weights('weights.h5')
    elif choice == 'predict':
        model.load_weights('./weights.h5')
        img = image.load_img(prediction, target_size=shape)
        im = image.img_to_array(img)
        im = np.expand_dims(im, axis=0)
        # Apply the preprocessing that belongs to the chosen architecture
        if name == 'vgg16':
            im = keras.applications.vgg16.preprocess_input(im)
        elif name == 'vgg19':
            im = keras.applications.vgg19.preprocess_input(im)
        elif name == 'resnet50':
            im = keras.applications.resnet50.preprocess_input(im)
        elif name == 'densenet201':
            im = keras.applications.densenet.preprocess_input(im)
        # Raw class scores, one per entry of `classes`
        prediction = model.predict(im)
        print(prediction)

CNN(CNN='VGG16', choice='train')
Results:
              precision    recall  f1-score   support
    Predator       0.49      0.49      0.49      2016
    Omnivore       0.49      0.49      0.49      2016
    accuracy         --        --      0.49      4032
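I suspect that ImageDataGenerator() is not shuffling the images before the training/validation split. If that is the case, how can I force ImageDataGenerator in Keras to shuffle the dataset before splitting?
If shuffling is not the issue, how do I solve my problem? What am I doing wrong?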
Answer 0 (score: 0)
So your model is basically overfitting, which means it is "memorizing" your training set. I have a few suggestions:
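Check that your 2 classes are balanced in the training set, i.e. roughly a 50/50 split between class 0 and class 1. For example, if 90% of the training data is labeled 0, the model can simply predict 0 for everything and still be right 90% of the time on validation.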
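A quick way to run this balance check is to count the image files in each class folder. The sketch below assumes the one-folder-per-class layout used by `flow_from_directory` in the question; the helper names and the list of file extensions are illustrative choices, not part of the original code.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root):
    """Map each class sub-directory of `root` to the number of image files in it."""
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        counts[class_dir.name] = sum(
            1
            for p in class_dir.rglob("*")
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


def class_fractions(counts):
    """Convert raw counts into the fraction of the dataset held by each class."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


train_dir = Path("./dataset/Train")  # path taken from the question's code
if train_dir.is_dir():
    counts = count_images_per_class(train_dir)
    print(counts)
    print(class_fractions(counts))
```

Fractions far from 0.5 for either class would point to the imbalance described above.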
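If your training data is already balanced, then your model is not generalizing. Perhaps you could try a pre-trained model instead of training every layer of VGG from scratch? You can load VGG's pre-trained weights without the top (include_top=False) and train only the dense layers.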
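As a rough sketch of that idea, the function below loads the VGG16 convolutional base, freezes it, and adds a small trainable head. It uses the `tensorflow.keras` API rather than the standalone `keras` package from the question. The function name, the 256-unit dense layer and the 0.5 dropout rate are arbitrary assumptions chosen for illustration.

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16


def build_transfer_model(num_classes=2, input_shape=(224, 224, 3), weights="imagenet"):
    """VGG16 convolutional base (frozen) with a small trainable dense head."""
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    # Freeze every layer of the base so only the new head is updated in training.
    for layer in base.layers:
        layer.trainable = False

    # Hypothetical head: the layer sizes here are illustrative, not tuned.
    x = layers.Flatten()(base.output)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer=optimizers.SGD(learning_rate=1e-3, momentum=0.9, nesterov=True),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_transfer_model()` downloads the ImageNet weights on first use. Those weights expect inputs that have gone through `vgg16.preprocess_input`, which can be supplied to the generators through `ImageDataGenerator(preprocessing_function=...)`.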
Use cross-validation. Reshuffle the data for each validation fold and see whether the results on the test set improve.
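One possible way to set this up is to build the folds over file paths and labels, then train one model per fold. The sketch below uses scikit-learn's `StratifiedKFold` as one concrete cross-validation scheme (the answer does not name a specific one); the file names and the `make_folds` helper are hypothetical.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def make_folds(filepaths, labels, n_splits=5, seed=42):
    """Return (train_paths, train_labels, val_paths, val_labels) for each fold.

    Data are shuffled before splitting, and every fold keeps the class ratio.
    """
    filepaths = np.asarray(filepaths)
    labels = np.asarray(labels)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    folds = []
    for train_idx, val_idx in skf.split(filepaths, labels):
        folds.append(
            (filepaths[train_idx], labels[train_idx],
             filepaths[val_idx], labels[val_idx])
        )
    return folds


# Hypothetical file list: 50 images per class, named for illustration only.
paths = [f"Predator/img_{i}.jpg" for i in range(50)] + \
        [f"Omnivore/img_{i}.jpg" for i in range(50)]
labels = [0] * 50 + [1] * 50

for k, (tr_p, tr_y, va_p, va_y) in enumerate(make_folds(paths, labels)):
    # A model would be trained on (tr_p, tr_y) and validated on (va_p, va_y) here.
    print(k, len(tr_p), len(va_p), np.bincount(va_y))
```

Each fold's path and label arrays could then be fed to a generator such as `flow_from_dataframe`. When scoring the held-out test set with a Keras generator, it may also be worth creating that generator with `shuffle=False`, so that its `classes` attribute lines up with the order of the predictions.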