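I have trained VGG as a 10-class classifier for 100 epochs, and this is the training/validation accuracy.
I also want to test the model on a held-out test set, so I evaluated it as follows: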
test_datagen = ImageDataGenerator(
    rescale=1./255,
)
test_generator = test_datagen.flow_from_directory(
    '/content/drive/My Drive/Colab Notebooks/domat/solo-dataset/test/',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    shuffle=False
)
steps = 3616 // 32
loss, accuracy = model_vgg_imagenet_dropout.evaluate_generator(test_generator,
                                                               steps=steps,
                                                               workers=4,
                                                               use_multiprocessing=True)
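When I print the results, I get (1.4021655139801776, 0.802820796460177), which is close to what I expected.
However, when I try to evaluate the model manually through model.predict_generator, I only get 13% accuracy.
Here is the code I use for the manual evaluation (the generator is the same object):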
import numpy as np

predictions = model_vgg_imagenet_dropout.predict_generator(test_generator,
                                                           steps=steps,
                                                           workers=4,
                                                           use_multiprocessing=True)
y_pred = np.zeros(len(predictions))
for i, p in enumerate(predictions):
    max_index = np.argmax(p)
    y_pred[i] = max_index
# the y_pred array should contain the class index of each sample, as defined by test_generator.class_indices
y_true = test_generator.classes
from sklearn.metrics import accuracy_score
print(accuracy_score(y_true, y_pred))
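For reference, the loop above should be equivalent to a single row-wise argmax followed by an element-wise comparison. Below is a minimal sketch of that computation on made-up data; the `predictions` and `y_true` arrays are hypothetical stand-ins for the model output and `test_generator.classes`, not values from my run.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes (made-up values).
predictions = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.4, 0.3],
])

# Hypothetical ground-truth class indices, standing in for test_generator.classes.
y_true = np.array([1, 0, 2, 0])

# Row-wise argmax gives the predicted class index of each sample,
# which is what the explicit loop computes one row at a time.
y_pred = np.argmax(predictions, axis=1)

# The comparison is only meaningful if both arrays have the same length
# and describe the same samples in the same order.
assert y_pred.shape == y_true.shape

# Fraction of samples whose predicted index matches the true index.
manual_accuracy = float(np.mean(y_pred == y_true))
print(manual_accuracy)
```

The shape check does not prove that the two arrays are aligned sample by sample; it only rules out a length mismatch.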
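I don't know where I am making a mistake; this looks correct to me.
Edit: when I inspect the output of model.predict_generator() by hand and map the softmax values to class indices, the predictions in most cases fall into only 3 or 4 classes.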
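The kind of inspection described in the edit can be sketched by counting how often each class index is predicted. The `y_pred` values below are made up for illustration; `num_classes = 10` matches the 10-class setup, and the cast to `int` is needed because `y_pred` in my code is created with `np.zeros` and is therefore a float array.

```python
import numpy as np

# Hypothetical predicted class indices (made-up values, stored as floats
# to mimic an array created with np.zeros).
y_pred = np.array([3.0, 4.0, 3.0, 3.0, 4.0, 0.0, 3.0, 4.0])

num_classes = 10  # the classifier in the question has 10 classes

# np.bincount requires non-negative integers, so cast first.
# minlength keeps a slot for classes that are never predicted.
counts = np.bincount(y_pred.astype(int), minlength=num_classes)

for class_index, count in enumerate(counts):
    print(class_index, count)

# Classes that were predicted at least once.
predicted_classes = np.flatnonzero(counts).tolist()
print(predicted_classes)
```

Running the same count on `test_generator.classes` would show the true class distribution for comparison.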