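I am using Keras 2.0.4 (TensorFlow backend) for an image classification task.
I am trying to train my own network from scratch (without any pretrained parameters).
Because my data set is large, I cannot load all of it into memory.
I therefore use ImageDataGenerator(), flow_from_directory() and fit_generator().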
Creating the ImageDataGenerator object:
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(preprocessing_function=my_preprocessing_function)  # only preprocessing; no augmentation; static data set
my_preprocessing_function rescales each image to the range [0, 255] and centers the data by mean subtraction (similar to the preprocessing used for VGG16 or VGG19).
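For illustration, here is a minimal sketch of what a preprocessing function of this kind could look like. It is not the actual my_preprocessing_function, which is not shown in this post. The name example_preprocessing_function, the min-max rescaling, and the channel means (the commonly quoted ImageNet RGB means) are all assumptions. ImageDataGenerator calls the function with one rank-3 NumPy array per image and expects an array of the same shape back.

```python
import numpy as np

# Assumed per-channel means in RGB order (the commonly quoted ImageNet values).
# A real data set would normally use means computed from its own training images.
ASSUMED_CHANNEL_MEANS = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def example_preprocessing_function(image):
    """Rescale one image to [0, 255], then subtract the per-channel means.

    `image` is a rank-3 array (height, width, channels); the result has the
    same shape, as ImageDataGenerator's preprocessing_function requires.
    """
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi > lo:  # skip the rescale for a constant image to avoid dividing by zero
        image = (image - lo) / (hi - lo) * 255.0
    return image - ASSUMED_CHANNEL_MEANS
```

After this step the pixel values are roughly centered around zero but still span a range of about 255, not [0, 1] or [-1, 1].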
Calling the flow_from_directory() method of the ImageDataGenerator object:
train_generator = train_datagen.flow_from_directory(
    'path/to/training/directory/with/five/subfolders',
    target_size=(img_width, img_height),
    batch_size=64,
    classes=['class1', 'class2', 'class3', 'class4', 'class5'],
    shuffle=True,
    seed=1337,
    class_mode='categorical')
(The validation_generator is created in the same way.)
After defining and compiling the model (loss function: categorical crossentropy, optimizer: Adam), I train it with fit_generator():
model.fit_generator(
    train_generator,
    steps_per_epoch=total_amount_of_train_samples/batch_size,
    epochs=400,
    validation_data=validation_generator,
    validation_steps=total_amount_of_validation_samples/batch_size)
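In Python 3 the `/` in the steps_per_epoch and validation_steps arguments produces a float whenever the sample count is not a multiple of the batch size. A minimal sketch of computing whole-number step counts instead is given below. The helper name steps_for and the sample count are hypothetical, and it assumes the generator yields a smaller final batch for the leftover samples.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once per epoch.

    Rounds up so that a final, smaller batch is still counted.
    """
    return math.ceil(num_samples / batch_size)


# Hypothetical sample count, for illustration only.
example_train_steps = steps_for(10000, 64)  # 156 full batches + 1 batch of 16
```

The results could then be passed as steps_per_epoch and validation_steps. Using `num_samples // batch_size` instead would drop the partial batch in each epoch.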
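Problem:
There is no error message, but training does not work.
After 400 epochs the accuracy still oscillates around 20% (no better than picking one of the five classes at random). In fact, the classifier always predicts 'class1'.
The same is true after only one epoch of training. Why does this happen even though I initialize the weights randomly?
What is wrong? What am I missing?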
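To make the "no better than random" comparison concrete, here is a small sketch of the chance-level numbers for five classes and a check for whether predictions have collapsed onto one class. The helper is_collapsed and its threshold are hypothetical. Always predicting one class scores exactly 20% only if that class makes up 20% of the evaluated samples.

```python
import math
from collections import Counter

NUM_CLASSES = 5

# Accuracy of uniform random guessing on a balanced five-class problem.
chance_accuracy = 1.0 / NUM_CLASSES

# Categorical crossentropy of a uniform softmax output: -ln(1/5) = ln(5), about 1.609.
chance_loss = math.log(NUM_CLASSES)


def is_collapsed(predicted_indices, threshold=0.95):
    """Return True if one class accounts for at least `threshold` of the predictions."""
    if not predicted_indices:
        return False
    most_common_count = Counter(predicted_indices).most_common(1)[0][1]
    return most_common_count / len(predicted_indices) >= threshold
```

The predicted indices could come from taking the argmax over the softmax outputs. A training loss that stays close to chance_loss would suggest the outputs are still nearly uniform.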
Model used:
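from keras.layers import Input, Conv2D, MaxPooling2D, Dense

inputs = Input(shape=input_shape)  # keep a handle on the input tensor instead of overwriting it
# Block 1
x = Conv2D(16, (3, 3), activation='relu', padding='same', name='block1_conv1')(inputs)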
x = Conv2D(16, (5, 5), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(64, (5, 5), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Conv2D(16, (1, 1), activation='relu', padding='same', name='block3_conv1')(x)
# Block 4
x = Conv2D(256, (3, 3), activation='relu', padding='valid', name='block4_conv1')(x)
x = Conv2D(256, (5, 5), activation='relu', padding='valid', name='block4_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Conv2D(1024, (3, 3), activation='relu', padding='valid', name='block5_conv1')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
# Top (classification) layers
x = Dense(1024, activation='relu', name='fc1')(x)
x = Dense(1024, activation='relu', name='fc2')(x)
predictions = Dense(5, activation='softmax', name='predictions')(x)
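As a sanity check on the architecture, the sketch below traces the spatial size of the feature maps through the convolution and pooling layers listed above. The 224-pixel input size is an assumption, since img_width and img_height are not given in this post. The helpers conv_out, pool_out and trace are hypothetical names. All convolutions use stride 1, and the pooling layers use Keras' default 'valid' padding.

```python
def conv_out(size, kernel, padding):
    """Spatial size after a stride-1 convolution."""
    if padding == 'same':
        return size
    return size - kernel + 1  # 'valid': no padding


def pool_out(size, pool=2, stride=2):
    """Spatial size after max pooling with 'valid' padding."""
    return (size - pool) // stride + 1


# (layer name, kind, kernel size, padding), in the order of the listing above.
LAYERS = [
    ('block1_conv1', 'conv', 3, 'same'),
    ('block1_conv2', 'conv', 5, 'same'),
    ('block1_pool', 'pool', 2, None),
    ('block2_conv1', 'conv', 3, 'same'),
    ('block2_conv2', 'conv', 5, 'same'),
    ('block2_pool', 'pool', 2, None),
    ('block3_conv1', 'conv', 1, 'same'),
    ('block4_conv1', 'conv', 3, 'valid'),
    ('block4_conv2', 'conv', 5, 'valid'),
    ('block4_pool', 'pool', 2, None),
    ('block5_conv1', 'conv', 3, 'valid'),
    ('block5_pool', 'pool', 2, None),
]


def trace(size):
    """Return (layer name, spatial size) pairs for a square input of side `size`."""
    shapes = []
    for name, kind, kernel, padding in LAYERS:
        size = conv_out(size, kernel, padding) if kind == 'conv' else pool_out(size)
        shapes.append((name, size))
    return shapes


for layer_name, side in trace(224):  # 224 is an assumed input size
    print(layer_name, side)
```

With the assumed 224x224 input, block5_pool ends at 11x11 with 1024 channels. The listing shows no Flatten layer before fc1. If one were placed there, fc1 would receive 11 * 11 * 1024 = 123904 features.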