I have a question about using data augmentation with feature extraction in Keras. I am building a dog breed classifier.
By feature extraction, I mean extending the convolutional base (conv_base, VGG16) by adding dense layers on top and running the whole thing end to end on the input data. This lets me use data augmentation, because every input image passes through the convolutional base each time the model sees it.
Training set: 6680 images belonging to 133 classes
Validation set: 835 images belonging to 133 classes
Test set: 836 images belonging to 133 classes
I was able to implement data augmentation and feature extraction successfully on their own, but when I try to combine the two, my accuracy is tiny for some reason. Why is this? Am I doing something fundamentally wrong in my approach?
from keras.applications import VGG16
from keras.models import Sequential
from keras.layers import Dense, GlobalAveragePooling2D
from keras.callbacks import ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator

conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(224, 224, 3))

model = Sequential()
model.add(conv_base)
conv_base.trainable = False  # freeze the VGG16 weights before compiling
model.add(GlobalAveragePooling2D())
model.add(Dense(133, activation='softmax'))

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
train_datagen_aug = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen_aug = ImageDataGenerator(rescale=1./255)

train_generator_aug = train_datagen_aug.flow_from_directory(
    'myImages/train',
    target_size=(224, 224),
    batch_size=50,
    class_mode='categorical')

validation_generator_aug = test_datagen_aug.flow_from_directory(
    'myImages/valid',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical')

checkpointer_aug = ModelCheckpoint(filepath='saved_models/dogs_transfer_aug_model.h5',
                                   save_best_only=True)

history = model.fit_generator(
    train_generator_aug,
    steps_per_epoch=130,
    epochs=20,
    validation_data=validation_generator_aug,
    verbose=1,
    callbacks=[checkpointer_aug],
    validation_steps=26)
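As a side note (not part of the original post), steps_per_epoch=130 with batch_size=50 only covers 6500 of the 6680 training images per epoch. Step counts that cover the whole set each epoch can be derived like this:

```python
import math

# Sample counts and batch sizes from the question.
train_samples, val_samples = 6680, 835
train_batch, val_batch = 50, 32

steps_per_epoch = math.ceil(train_samples / train_batch)  # rounds up so no batch is dropped
validation_steps = math.ceil(val_samples / val_batch)

print(steps_per_epoch, validation_steps)  # 134 27
```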
The output looks like this:
Epoch 1/20
130/130 [==============================] - 293s - loss: 15.9044 - acc: 0.0083 - val_loss: 16.0019 - val_acc: 0.0072
Epoch 2/20
130/130 [==============================] - 281s - loss: 15.9972 - acc: 0.0075 - val_loss: 15.9977 - val_acc: 0.0075
Epoch 3/20
130/130 [==============================] - 280s - loss: 16.0220 - acc: 0.0060 - val_loss: 15.9977 - val_acc: 0.0075
Epoch 4/20
130/130 [==============================] - 280s - loss: 15.9941 - acc: 0.0077 - val_loss: 16.0019 - val_acc: 0.0072
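For context (a quick sanity check, not from the original post): the logged accuracies correspond to chance level for a 133-class problem, i.e. the network is effectively guessing at random:

```python
# With 133 roughly balanced classes, uniform guessing scores ~1/133.
num_classes = 133
chance_acc = 1.0 / num_classes
print(round(chance_acc, 4))  # 0.0075 -- matching the logged acc/val_acc above
```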
Answer 0 (score 0):
I would suggest the problem lies in the model rather than in the fitting, as indicated by the model's loss and accuracy. We can try a smaller version of VGG16 (with fewer layers):
from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, ZeroPadding2D

NUMBER_OF_TRAINING_SAMPLES = 6680
NUMBER_OF_VALIDATION_SAMPLES = 835
batch_size = 32
out_classes = 133
input_shape = (224, 224, 3)
def buildSmallVGG(out_classes, input_shape):
    model = Sequential()
    model.add(ZeroPadding2D((1, 1), input_shape=input_shape))
    model.add(Conv2D(16, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(out_classes, activation='softmax'))
    return model
model = buildSmallVGG(out_classes, input_shape)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])  # compile before fitting

history = model.fit_generator(
    train_generator_aug,
    steps_per_epoch=NUMBER_OF_TRAINING_SAMPLES // batch_size,
    epochs=20,
    validation_data=validation_generator_aug,
    callbacks=[checkpointer_aug],
    validation_steps=NUMBER_OF_VALIDATION_SAMPLES // batch_size)
The above is untested. It would be good if you could share your results in terms of loss, accuracy, etc.
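To report the per-epoch results requested here, one option (a minimal sketch; summarize_history is a hypothetical helper, and the 'acc'/'val_acc' keys match the Keras version used in the question) is to tabulate History.history after training:

```python
def summarize_history(hist_dict):
    """Return one formatted line per epoch from a Keras History.history dict."""
    return ['epoch %d: loss=%.4f acc=%.4f val_loss=%.4f val_acc=%.4f'
            % (i + 1, hist_dict['loss'][i], hist_dict['acc'][i],
               hist_dict['val_loss'][i], hist_dict['val_acc'][i])
            for i in range(len(hist_dict['loss']))]

# After training:
#   for line in summarize_history(history.history):
#       print(line)
```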