Feature extraction with data augmentation in Keras

Asked: 2018-02-27 17:15:20

Tags: neural-network keras conv-neural-network feature-extraction

I have a question about using data augmentation for feature extraction in Keras. I am building a dog breed classifier.

By feature extraction, I mean extending the model (conv_base, VGG16) by adding dense layers on top and running the whole thing end to end on the input data. This lets me use data augmentation, because every input image passes through the convolutional base each time the model sees it.

Training set: 6680 images belonging to 133 classes

Validation set: 835 images belonging to 133 classes

Test set: 836 images belonging to 133 classes

I was able to implement data augmentation and feature extraction successfully on their own, but when I try to combine the two, my accuracy is extremely low for some reason. Why is that? Am I making some major mistake in my approach?

from keras.applications import VGG16
from keras.models import Sequential
from keras.layers import Dense, GlobalAveragePooling2D
from keras.callbacks import ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator

conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(224, 224, 3))
conv_base.trainable = False  # freeze the VGG16 base before compiling

model = Sequential()
model.add(conv_base)
model.add(GlobalAveragePooling2D())
model.add(Dense(133, activation='softmax'))

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
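With conv_base.trainable = False, only the new head learns anything. A quick back-of-the-envelope check of how few weights that leaves (a sketch, assuming the standard VGG16 topology, whose last conv block outputs 7x7x512 feature maps for 224x224 inputs):

```python
# Trainable parameters in the head, assuming VGG16's final conv block
# outputs 512 channels for 224x224 input.
conv_base_channels = 512
num_classes = 133

# GlobalAveragePooling2D has no weights; it averages each channel map
# down to a single value, giving a 512-dimensional feature vector.
pooled_features = conv_base_channels

# Dense(133) holds one weight per (feature, class) pair plus one bias per class.
dense_params = pooled_features * num_classes + num_classes
print(dense_params)  # 68229 trainable parameters in the new head
```

So the trainable part of the model is tiny compared to VGG16's ~14.7M frozen weights, which is exactly the point of feature extraction.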

train_datagen_aug = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)

test_datagen_aug = ImageDataGenerator(rescale=1./255)

train_generator_aug = train_datagen_aug.flow_from_directory(
    'myImages/train',
    target_size=(224, 224),
    batch_size=50,
    class_mode='categorical')

validation_generator_aug = test_datagen_aug.flow_from_directory(
        'myImages/valid',
        target_size=(224, 224),
        batch_size=32,
        class_mode='categorical')

checkpointer_aug = ModelCheckpoint(filepath='saved_models/dogs_transfer_aug_model.h5', 
                            save_best_only=True)

history = model.fit_generator(
      train_generator_aug,
      steps_per_epoch=130,
      epochs=20,
      validation_data=validation_generator_aug,
      verbose=1,
      callbacks=[checkpointer_aug],
      validation_steps=26)
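As an aside on the step counts used above: to cover every image exactly once per epoch, steps_per_epoch should be the ceiling of samples / batch_size. A small arithmetic sketch using the dataset sizes from the question:

```python
import math

train_samples, train_batch = 6680, 50
val_samples, val_batch = 835, 32

# Steps needed so the generator yields every image once per epoch.
train_steps = math.ceil(train_samples / train_batch)
val_steps = math.ceil(val_samples / val_batch)

print(train_steps)  # 134 (the question uses 130, so 180 images are skipped each epoch)
print(val_steps)    # 27  (the question uses 26, covering 832 of the 835 images)
```

Slightly undershooting the step count is harmless here (the generator loops indefinitely), but it does mean each epoch silently sees a bit less data.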

The output looks like this:

Epoch 1/20
130/130 [==============================] - 293s - loss: 15.9044 - acc: 0.0083 - val_loss: 16.0019 - val_acc: 0.0072
Epoch 2/20
130/130 [==============================] - 281s - loss: 15.9972 - acc: 0.0075 - val_loss: 15.9977 - val_acc: 0.0075
Epoch 3/20
130/130 [==============================] - 280s - loss: 16.0220 - acc: 0.0060 - val_loss: 15.9977 - val_acc: 0.0075
Epoch 4/20
130/130 [==============================] - 280s - loss: 15.9941 - acc: 0.0077 - val_loss: 16.0019 - val_acc: 0.0072
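One way to read these numbers (a back-of-the-envelope sketch, assuming Keras's default clipping of predicted probabilities at epsilon = 1e-7): a loss pinned near 16 usually means the softmax is fully saturated, predicting one class with probability ~1.0, so wrong samples each cost -ln(1e-7) and correct ones cost ~0. That pattern often points at an input-scaling or preprocessing mismatch rather than at slow learning.

```python
import math

# Keras clips predicted probabilities to [eps, 1 - eps] before taking the log,
# so a completely confident wrong prediction costs -ln(eps) per sample.
eps = 1e-7
max_loss = -math.log(eps)          # ~16.118

# With accuracy acc, expected loss = max_loss on the wrong fraction, ~0 otherwise.
acc = 0.0075
expected_loss = (1 - acc) * max_loss
print(round(expected_loss, 4))     # 15.9972 -- matching the logged loss almost exactly
```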

1 answer:

Answer 0 (score: 0)

I suspect it is a model problem rather than a fitting problem, as the model's loss and accuracy suggest. We could try a smaller version of VGG16 (with fewer layers):

from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, ZeroPadding2D

NUMBER_OF_TRAINING_SAMPLES = 6680
NUMBER_OF_VALIDATION_SAMPLES = 835
batch_size = 32
out_classes = 133
input_shape = (224, 224, 3)

def buildSmallVGG(out_classes, input_shape):
    model = Sequential()
    model.add(ZeroPadding2D((1, 1), input_shape=input_shape))
    model.add(Conv2D(16, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(out_classes, activation='softmax'))
    return model

model = buildSmallVGG(out_classes, input_shape)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit_generator(
      train_generator_aug,
      steps_per_epoch=NUMBER_OF_TRAINING_SAMPLES // batch_size,
      epochs=20,
      validation_data=validation_generator_aug,
      callbacks=[checkpointer_aug],
      validation_steps=NUMBER_OF_VALIDATION_SAMPLES // batch_size)
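To see what this smaller network actually computes, here is an arithmetic sketch of the feature-map sizes layer by layer, assuming each ZeroPadding2D(1) followed by a 3x3 'valid' convolution preserves the spatial size and each 2x2 max pool halves it:

```python
# Trace spatial size and channel count through the three conv blocks above.
size, channels = 224, 3
for out_channels in (16, 32, 64):
    size = (size + 2 - 3 + 1) // 2   # pad by 1, 3x3 conv, then 2x2 pool
    channels = out_channels

flat = size * size * channels
print(size, channels, flat)  # 28 64 50176

# The Flatten -> Dense(256) layer dominates the parameter count:
dense_params = flat * 256 + 256
print(dense_params)  # 12845312
```

So almost all of this model's capacity sits in the Flatten-to-Dense(256) connection, a common property of small VGG-style networks without global pooling.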

The above is untested. It would be great if you could share your results on loss, accuracy, etc.