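I am struggling with a seemingly simple problem: I can't figure out how to match my input images to the probabilities my model produces.
Training and validation of my model (vanilla VGG16, retrained for two classes, dogs and cats) go well and get me close to 97% validation accuracy, but when I check which images I got right and which I got wrong, I only get random results.
Found 1087 correct labels (53.08%)
I am pretty sure it has something to do with the ImageDataGenerator producing random batches of my validation images, even though I set shuffle = False.
I simply save the filenames and classes of my generator before running it, and I assume that the indices of my filenames and classes match the order of my probability output.
Here is my setup (vanilla VGG16, with the last layer replaced to match the two classes, cats and dogs):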
new_model.summary()
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
_________________________________________________________________
flatten (Flatten) (None, 25088) 0
_________________________________________________________________
fc1 (Dense) (None, 4096) 102764544
_________________________________________________________________
fc2 (Dense) (None, 4096) 16781312
_________________________________________________________________
Binary_predictions (Dense) (None, 2) 8194
=================================================================
Total params: 134,268,738
Trainable params: 8,194
Non-trainable params: 134,260,544
_________________________________________________________________
batch_size=16
epochs=3
learning_rate=0.01
Here are the generator definitions for training and validation. I have not included the data augmentation part at this point.
train_datagen = ImageDataGenerator()
validation_datagen = ImageDataGenerator()
test_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')
train_filenames = train_generator.filenames
train_samples = len(train_filenames)
validation_generator = validation_datagen.flow_from_directory(
    valid_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)  # Must be False so classes and filenames can be extracted in the order they are predicted
validation_filenames = validation_generator.filenames
validation_samples = len(validation_filenames)
Fine-tuning the model works fine:
#Fine-tune the model
#DOC: fit_generator(generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None,
# validation_data=None, validation_steps=None, class_weight=None,
# max_queue_size=10, workers=1, use_multiprocessing=False, initial_epoch=0)
new_model.fit_generator(
    train_generator,
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_samples // batch_size)
Epoch 1/3
1434/1434 [==============================] - 146s - loss: 0.5456 - acc: 0.9653 - val_loss: 0.5043 - val_acc: 0.9678
Epoch 2/3
1434/1434 [==============================] - 148s - loss: 0.5312 - acc: 0.9665 - val_loss: 0.4293 - val_acc: 0.9722
Epoch 3/3
1434/1434 [==============================] - 148s - loss: 0.5332 - acc: 0.9665 - val_loss: 0.4329 - val_acc: 0.9731
Extracting the validation data:
#We need the probabilities/scores for the validation set
#DOC: predict_generator(generator, steps, max_queue_size=10, workers=1,
# use_multiprocessing=False, verbose=0)
probs = new_model.predict_generator(
    validation_generator,
    steps=validation_samples // batch_size,
    verbose=1)
#Extracting the probabilities and labels
our_predictions = probs[:,0]
our_labels = np.round(1-our_predictions)
expected_labels = validation_generator.classes
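Now, when I measure success on the validation set by comparing the expected labels with the computed labels, I get something suspiciously close to random: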
correct = np.where(our_labels==expected_labels)[0]
print("Found {:3d} correct labels ({:.2f}%)".format(len(correct),
100*len(correct)/len(our_predictions)))
Found 1087 correct labels (53.08%)
Obviously this is not correct.
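As a sanity check on the comparison step itself, here is a minimal sketch with made-up probabilities (the arrays below are hypothetical, not the real output, and `count_correct` is an illustrative helper name). It derives labels with argmax, which for two classes should agree with np.round(1 - probs[:, 0]), and refuses to compare arrays of different lengths:

```python
import numpy as np


def count_correct(probs, expected_labels):
    """Return (predicted_labels, number of matches) for one-hot style scores."""
    predicted_labels = probs.argmax(axis=1)  # index of the highest score per row
    if len(predicted_labels) != len(expected_labels):
        # A length mismatch usually means the number of predict steps
        # did not cover the whole data set.
        raise ValueError("got {} predictions for {} labels".format(
            len(predicted_labels), len(expected_labels)))
    return predicted_labels, int((predicted_labels == expected_labels).sum())


# Hypothetical scores: column 0 = class 0, column 1 = class 1.
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4],
                  [0.3, 0.7]])
expected_labels = np.array([0, 1, 1, 1])

predicted_labels, n_correct = count_correct(probs, expected_labels)
accuracy = n_correct / len(expected_labels)
print("Found {:3d} correct labels ({:.2f}%)".format(n_correct, 100 * accuracy))
```

If this logic gives sensible numbers on hand-made data, the problem is more likely in the order of the predictions than in the comparison.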
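I suspect it has something to do with the randomness of the generators, but I did set shuffle = False.
This code is copied directly from the Fast.ai course by the great Jeremy Howard, but I can't get it to work anymore...
I am using Keras 2.0.8 with the TensorFlow 1.3 backend on Python 3.5 under Anaconda...
Please help me keep my sanity!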
Answer 0 (score: 1)
You need to call validation_generator.reset() between fit_generator() and predict_generator().
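In the *_generator() functions, batches of data are inserted into a queue before they are used to fit or evaluate the model. The underlying queue is always kept full, so some extra batches are left in the queue when training ends. You can verify this by printing validation_generator.batch_index after training. As a result, your predict_generator() does not start from the first batch, and probs[0] is not the prediction for the first image. That is why our_labels is not aligned with expected_labels and the accuracy is low.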
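The effect can be reproduced without Keras. The sketch below uses a hypothetical `ToyIterator` class that only mimics the relevant behaviour of a non-shuffling iterator (a `batch_index` that keeps advancing and a `reset()` method); it is not Keras code, and the numbers are made up:

```python
class ToyIterator:
    """Hypothetical stand-in for a non-shuffling batch iterator."""

    def __init__(self, n_samples, batch_size):
        self.n_samples = n_samples
        self.batch_size = batch_size
        self.batch_index = 0

    def reset(self):
        self.batch_index = 0

    def __iter__(self):
        return self

    def __next__(self):
        n_batches = -(-self.n_samples // self.batch_size)  # ceiling division
        start = self.batch_index * self.batch_size
        stop = min(start + self.batch_size, self.n_samples)
        # Wrap around to the first batch after the last one.
        self.batch_index = (self.batch_index + 1) % n_batches
        return list(range(start, stop))  # sample indices in this batch


def collect(iterator, steps):
    """Pull `steps` batches and return the sample indices in the order seen."""
    seen = []
    for _ in range(steps):
        seen.extend(next(iterator))
    return seen


gen = ToyIterator(n_samples=10, batch_size=4)

# Simulate training: one full epoch (3 batches) plus 2 batches prefetched
# into a queue and never used.
collect(gen, steps=5)

misaligned = collect(gen, steps=3)  # "predict" without reset
gen.reset()
aligned = collect(gen, steps=3)     # "predict" after reset

print(misaligned)
print(aligned)
```

Without the reset, the first "prediction" belongs to a sample from the last batch rather than sample 0, so comparing it against labels in file order gives chance-level accuracy.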
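Also, I think the line should be validation_steps=validation_samples // batch_size + 1 (likewise for the training generator). Unless validation_samples is a multiple of batch_size, using validation_steps=validation_samples // batch_size ignores one batch in each epoch, and your model is evaluated on a (slightly) different dataset in each epoch.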
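One caveat with `// batch_size + 1`: when the sample count is an exact multiple of the batch size it asks for one batch too many, which I would expect to wrap around and yield duplicate predictions. A ceiling division avoids both cases. This is a small sketch; `steps_to_cover` is a hypothetical helper name and the sample counts are illustrative:

```python
import math


def steps_to_cover(n_samples, batch_size):
    """Smallest number of batches that visits every sample exactly once."""
    return math.ceil(n_samples / batch_size)


batch_size = 16
for n_samples in (2048, 2050):
    floor_steps = n_samples // batch_size          # may drop the last partial batch
    plus_one_steps = n_samples // batch_size + 1   # may add a surplus batch
    print(n_samples, floor_steps, plus_one_steps,
          steps_to_cover(n_samples, batch_size))
```

The result can be passed as steps, steps_per_epoch or validation_steps in place of the floor division.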
Answer 1 (score: 0)
I ran into a similar problem before. I found predict_generator() unfriendly, so I wrote a function to test the data set myself.
Here is my code snippet:
import json
import os

import numpy as np
from PIL import Image


def get_img_result(img_path):
    # Load one image and resize it to the model's input size.
    # img_width, img_height and new_model are globals defined elsewhere.
    image = Image.open(img_path)
    image.load()
    image = image.resize((img_width, img_height))
    if image.mode != 'RGB':  # compare strings with !=, not "is not"
        image = image.convert('RGB')
    array = np.asarray(image, dtype='int32')
    # Scale to [0, 1]; only appropriate if the model was trained on rescaled inputs.
    array = array / 255
    array = np.asarray([array])  # add the batch dimension
    result = new_model.predict(array)
    print(result)
    return result


# path: the root folder of the validation data set. validation->cat->kitty.jpg
def validate(path):
    result_list = []
    right_count = 0
    wrong_count = 0
    # Sort so that index i matches the class index Keras assigns (alphanumeric order).
    categories = sorted(os.listdir(path))
    for i, category in enumerate(categories):
        images = os.listdir(os.path.join(path, category))
        for image in images:
            result = get_img_result(os.path.join(path, category, image))[0]
            # Right when the true class has the highest score.
            right = int(result[i] == max(result))
            result_list.append({'image': image, 'category': category,
                                'score': result.tolist(), 'right': right})
            if right:
                right_count += 1
            else:
                wrong_count += 1
    with open('result.json', 'w') as f:
        json.dump(result_list, f)
    print('right count : {0} \n wrong count : {1} \n accuracy : {2}'.format(
        right_count, wrong_count, right_count / (right_count + wrong_count)))
I use PIL to convert each image to a numpy array the way Keras does, test all the images, and save the results to a JSON file.
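If the generator predictions are correctly aligned with the file order, a similar per-image record can be assembled without re-reading the images. This is only a sketch: `build_records` is a hypothetical helper, its inputs mirror the `filenames`, `classes` and `class_indices` attributes of a Keras directory iterator, and the values below are made up:

```python
import json


def build_records(filenames, classes, probs, class_indices):
    """Pair each file with its true class, predicted class and scores."""
    if not (len(filenames) == len(classes) == len(probs)):
        raise ValueError("filenames, classes and probs must have equal length")
    # Invert the name -> index mapping so indices can be shown as names.
    index_to_class = {index: name for name, index in class_indices.items()}
    records = []
    for filename, true_index, scores in zip(filenames, classes, probs):
        scores = list(scores)
        predicted_index = scores.index(max(scores))  # position of the top score
        records.append({
            'image': filename,
            'category': index_to_class[true_index],
            'predicted': index_to_class[predicted_index],
            'score': scores,
            'right': int(predicted_index == true_index),
        })
    return records


# Made-up example data.
records = build_records(
    filenames=['cats/cat.1.jpg', 'dogs/dog.1.jpg'],
    classes=[0, 1],
    probs=[[0.9, 0.1], [0.7, 0.3]],
    class_indices={'cats': 0, 'dogs': 1},
)
print(json.dumps(records, indent=2))
```

Keeping the filename next to each score makes a misaligned run easy to spot by opening a few of the listed images.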