I am trying to implement the MaxUp method (https://arxiv.org/pdf/2002.09024v1.pdf) to improve the generalization of an image classifier. As I understand it, the essence of the method is that for each data point in the dataset we generate a small set of perturbed or augmented data points (in my case I use augmentations), which gives a batch of m augmented images. We then pass that batch through the neural network and compute the loss for each example in the batch. After that, we optimize the network using only the maximum loss; we basically minimize the maximum loss.
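Written out, with m augmented copies of each example, the objective is to minimize the average over examples of the maximum loss over each example's m copies. A minimal sketch of that reduction (the batch layout is my assumption, not from the paper):

import tensorflow as tf

# A sketch of the MaxUp reduction for a batch laid out group-by-group,
# i.e. [img0_aug0, ..., img0_aug(m-1), img1_aug0, ...] (layout is my assumption).
def maxup_reduction(per_example_losses, m):
    grouped = tf.reshape(per_example_losses, (-1, m))  # (n_images, m)
    worst = tf.reduce_max(grouped, axis=1)             # worst of the m augmentations
    return tf.reduce_mean(worst)                       # average over the images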
I am using the pre-balanced Imagenette dataset (https://github.com/fastai/imagenette).
So I built a simple convnet out of stacked VGG blocks:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import SGD

def define_model_VGG(loss):
    momentum = 0.9  # BatchNorm momentum
    # VGG-style blocks: two 3x3 convs with BatchNorm, then 2x2 max pooling
    model = Sequential()
    model.add(Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal',
                     input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3)))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    model.add(Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    model.add(Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    model.add(Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    model.add(Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    # Classifier head
    model.add(Flatten())
    model.add(Dense(512, activation='relu', kernel_initializer='he_normal'))
    model.add(Dense(256, activation='relu', kernel_initializer='he_normal'))
    model.add(Dense(10, activation='softmax'))
    opt = SGD(learning_rate=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss=loss, metrics=['accuracy'])
    return model
I defined a custom loss function so that it returns the maximum loss instead of the average loss:
def max_loss_minimization(y_true, y_pred):
    """
    Returns the maximum per-example loss in the batch.
    """
    loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
    max_loss = tf.keras.backend.max(loss)
    return max_loss
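To convince myself the loss does what I intend, I checked it on toy tensors (values are illustrative):

# Sanity check: the result should equal the largest per-example cross-entropy,
# here the second example's -log(0.6) ~ 0.51.
y_true = tf.constant([[1., 0.], [0., 1.]])
y_pred = tf.constant([[0.9, 0.1], [0.4, 0.6]])
print(max_loss_minimization(y_true, y_pred))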
I made a generator that returns an arbitrary number of augmented copies of a single image:
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def augmentation_generator(aug_batch_size=AUG_BATCH_SIZE):
    """
    Yields aug_batch_size augmented copies of one data point.
    """
    maxup_datagen = ImageDataGenerator(rotation_range=30,
                                       width_shift_range=0.2,
                                       height_shift_range=0.2,
                                       shear_range=0.2,
                                       zoom_range=0.2,
                                       horizontal_flip=True)
    train_datagen = ImageDataGenerator(rescale=1./255)
    maxup_train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(IMAGE_SIZE, IMAGE_SIZE),
        shuffle=True,
        batch_size=1,
        class_mode='categorical')
    while True:
        # Draw one (already rescaled) image and expand it into
        # aug_batch_size augmented copies that share its label.
        next_image, next_label = next(maxup_train_generator)
        i = 0
        X = np.zeros((aug_batch_size, IMAGE_SIZE, IMAGE_SIZE, 3))
        y = np.zeros((aug_batch_size, 10))
        for x_batch, y_batch in maxup_datagen.flow(next_image, next_label, batch_size=1):
            X[i, :, :, :] = x_batch
            y[i, :] = y_batch
            i += 1
            if i == aug_batch_size:
                break
        yield X, y
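For reference, the generator yields one original image expanded into AUG_BATCH_SIZE augmented copies that all share one label:

gen = augmentation_generator()
X, y = next(gen)     # one MaxUp group
print(X.shape)       # (AUG_BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 3)
print(y.shape)       # (AUG_BATCH_SIZE, 10)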
During training I generate a batch of augmented images, do a forward pass, compute the maximum loss, and update the model by applying the gradients:
test_datagen = ImageDataGenerator(rescale=1./255)
test_gen = test_datagen.flow_from_directory(
    test_dir,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=TEST_BATCH_SIZE,
    shuffle=True,
    class_mode='categorical')
val_data, val_labels = next(test_gen)       # a fixed validation batch
val_labels = np.argmax(val_labels, axis=1)  # one-hot -> class indices

model_maxup = define_model_VGG(loss=max_loss_minimization)
opt = SGD(learning_rate=0.001, momentum=0.9)  # optimizer used by the custom loop

epochs = 20
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))
    aug_generator = augmentation_generator()
    # Iterate over the batches of the dataset.
    for step in range(0, IMAGE_NUMBER):
        x_batch_train, y_batch_train = next(aug_generator)
        x_batch_train = tf.convert_to_tensor(x_batch_train, dtype='float32')
        y_batch_train = tf.convert_to_tensor(y_batch_train, dtype='float32')
        # Open a GradientTape to record the operations run during the
        # forward pass, which enables automatic differentiation.
        with tf.GradientTape() as tape:
            # Forward pass; training=True so BatchNorm uses batch statistics.
            probs = model_maxup(x_batch_train, training=True)
            # Compute the (maximum) loss value for this minibatch.
            loss_value = max_loss_minimization(y_batch_train, probs)
        # Retrieve the gradients of the trainable variables w.r.t. the loss.
        grads = tape.gradient(loss_value, model_maxup.trainable_weights)
        # Run one step of gradient descent.
        opt.apply_gradients(zip(grads, model_maxup.trainable_weights))
        # Log every 200 batches.
        if step % 200 == 0:
            print("Training loss (for one batch) at step %d: %.4f"
                  % (step, float(loss_value)))
            print("Seen so far: %s samples" % (step + 1))
        if step % 1000 == 0:
            pred_labels = np.argmax(model_maxup.predict(val_data), axis=1)
            print("Validation accuracy: {}".format(
                np.sum(val_labels == pred_labels) / TEST_BATCH_SIZE))
However, my neural network does not seem to train at all. I get the following output:
Start of epoch 0
Found 9300 images belonging to 10 classes.
Training loss (for one batch) at step 0: 2.3029
Seen so far: 1 samples
Validation accuracy: 0.0975
Training loss (for one batch) at step 200: 2.4079
Seen so far: 201 samples
Training loss (for one batch) at step 400: 2.3045
Seen so far: 401 samples
Training loss (for one batch) at step 600: 2.1799
Seen so far: 601 samples
Training loss (for one batch) at step 800: 2.3446
Seen so far: 801 samples
Training loss (for one batch) at step 1000: 2.3806
Seen so far: 1001 samples
Validation accuracy: 0.085
Training loss (for one batch) at step 1200: 2.2879
Seen so far: 1201 samples
Training loss (for one batch) at step 1400: 2.3160
Seen so far: 1401 samples
I also tried the high-level model.fit method (roughly the call sketched below) and got the same result. The problem is most likely caused by my custom loss or by my data generator. Can you tell me what I am doing wrong? I am also pretty sure there is a way to make this simpler.
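From memory, the fit variant was something like this (treat it as a sketch, not the exact code I ran):

model_maxup = define_model_VGG(loss=max_loss_minimization)
model_maxup.fit(augmentation_generator(),
                steps_per_epoch=IMAGE_NUMBER,
                epochs=20)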
Thanks.
UPD: I tried random Gaussian perturbations instead of augmentations, and it does not seem to change much. Also, while digging into the problem, I noticed that when I replace the reduce_max operation with reduce_mean, the training accuracy, as well as the validation accuracy, starts to improve. However, the paper states that they somehow minimize the maximum loss. Is there another way to minimize the maximum loss?
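One idea I am considering is to smooth the max with log-sum-exp, which approaches the true maximum as the temperature t grows while still giving every example a gradient (my own sketch, not from the paper):

# Smooth surrogate for the max: logsumexp(t * loss) / t -> max(loss) as t -> inf.
def soft_max_loss(y_true, y_pred, t=10.0):
    loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
    return tf.reduce_logsumexp(t * loss) / t

Would something like this keep the "minimize the maximum loss" behaviour while avoiding the single-example gradient of a hard max?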