I am trying to implement the MaxUp method (https://arxiv.org/pdf/2002.09024v1.pdf) to improve the generalization of an image classifier. As I understand it, the essence of the method is that for each data point in the dataset we generate a small set of perturbed or augmented data points (in my case I use augmentations), which gives a batch of m augmented images. We then pass that batch through the neural network and compute the loss for each example in the batch. After that, we optimize the network using only the maximum loss; we basically minimize the maximum loss.
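Written out, with m augmented copies of each example, the objective is to minimize the average over examples of the maximum loss over each example's m copies. A minimal sketch of that reduction (the batch layout is my assumption, not from the paper):

import tensorflow as tf

# A sketch of the MaxUp reduction for a batch laid out group-by-group,
# i.e. [img0_aug0, ..., img0_aug(m-1), img1_aug0, ...] (layout is my assumption).
def maxup_reduction(per_example_losses, m):
    grouped = tf.reshape(per_example_losses, (-1, m))  # (n_images, m)
    worst = tf.reduce_max(grouped, axis=1)             # worst of the m augmentations
    return tf.reduce_mean(worst)                       # average over the images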
I am using the pre-balanced Imagenette dataset (https://github.com/fastai/imagenette).
So I built a simple convnet out of stacked VGG blocks:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import SGD

def define_model_VGG(loss):
    momentum = 0.9  # BatchNorm momentum
    # VGG-style blocks: two 3x3 convs with BatchNorm, then 2x2 max pooling
    model = Sequential()
    model.add(Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal',
                     input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3)))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    model.add(Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    model.add(Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    model.add(Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    model.add(Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization(momentum=momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))
    # Classifier head
    model.add(Flatten())
    model.add(Dense(512, activation='relu', kernel_initializer='he_normal'))
    model.add(Dense(256, activation='relu', kernel_initializer='he_normal'))
    model.add(Dense(10, activation='softmax'))
    opt = SGD(learning_rate=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss=loss, metrics=['accuracy'])
    return model
I defined a custom loss function so that it returns the maximum loss instead of the average loss:
def max_loss_minimization(y_true, y_pred):
    """
    Returns the maximum per-example loss in the batch.
    """
    loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
    max_loss = tf.keras.backend.max(loss)
    return max_loss
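To convince myself the loss does what I intend, I checked it on toy tensors (values are illustrative):

# Sanity check: the result should equal the largest per-example cross-entropy,
# here the second example's -log(0.6) ~ 0.51.
y_true = tf.constant([[1., 0.], [0., 1.]])
y_pred = tf.constant([[0.9, 0.1], [0.4, 0.6]])
print(max_loss_minimization(y_true, y_pred))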
I made a generator that returns an arbitrary number of augmented copies of a single image:
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def augmentation_generator(aug_batch_size=AUG_BATCH_SIZE):
    """
    Yields aug_batch_size augmented copies of one data point.
    """
    maxup_datagen = ImageDataGenerator(rotation_range=30,
                                       width_shift_range=0.2,
                                       height_shift_range=0.2,
                                       shear_range=0.2,
                                       zoom_range=0.2,
                                       horizontal_flip=True)
    train_datagen = ImageDataGenerator(rescale=1./255)
    maxup_train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(IMAGE_SIZE, IMAGE_SIZE),
        shuffle=True,
        batch_size=1,
        class_mode='categorical')
    while True:
        # Draw one (already rescaled) image and expand it into
        # aug_batch_size augmented copies that share its label.
        next_image, next_label = next(maxup_train_generator)
        i = 0
        X = np.zeros((aug_batch_size, IMAGE_SIZE, IMAGE_SIZE, 3))
        y = np.zeros((aug_batch_size, 10))
        for x_batch, y_batch in maxup_datagen.flow(next_image, next_label, batch_size=1):
            X[i, :, :, :] = x_batch
            y[i, :] = y_batch
            i += 1
            if i == aug_batch_size:
                break
        yield X, y
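For reference, the generator yields one original image expanded into AUG_BATCH_SIZE augmented copies that all share one label:

gen = augmentation_generator()
X, y = next(gen)     # one MaxUp group
print(X.shape)       # (AUG_BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 3)
print(y.shape)       # (AUG_BATCH_SIZE, 10)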
During training I generate a batch of augmented images, do a forward pass, compute the maximum loss, and update the model by applying the gradients:
test_datagen = ImageDataGenerator(rescale=1./255)
test_gen = test_datagen.flow_from_directory(
    test_dir,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=TEST_BATCH_SIZE,
    shuffle=True,
    class_mode='categorical')
val_data, val_labels = next(test_gen)       # a fixed validation batch
val_labels = np.argmax(val_labels, axis=1)  # one-hot -> class indices

model_maxup = define_model_VGG(loss=max_loss_minimization)
opt = SGD(learning_rate=0.001, momentum=0.9)  # optimizer used by the custom loop

epochs = 20
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))
    aug_generator = augmentation_generator()
    # Iterate over the batches of the dataset.
    for step in range(0, IMAGE_NUMBER):
        x_batch_train, y_batch_train = next(aug_generator)
        x_batch_train = tf.convert_to_tensor(x_batch_train, dtype='float32')
        y_batch_train = tf.convert_to_tensor(y_batch_train, dtype='float32')
        # Open a GradientTape to record the operations run during the
        # forward pass, which enables automatic differentiation.
        with tf.GradientTape() as tape:
            # Forward pass; training=True so BatchNorm uses batch statistics.
            probs = model_maxup(x_batch_train, training=True)
            # Compute the (maximum) loss value for this minibatch.
            loss_value = max_loss_minimization(y_batch_train, probs)
        # Retrieve the gradients of the trainable variables w.r.t. the loss.
        grads = tape.gradient(loss_value, model_maxup.trainable_weights)
        # Run one step of gradient descent.
        opt.apply_gradients(zip(grads, model_maxup.trainable_weights))
        # Log every 200 batches.
        if step % 200 == 0:
            print("Training loss (for one batch) at step %d: %.4f"
                  % (step, float(loss_value)))
            print("Seen so far: %s samples" % (step + 1))
        if step % 1000 == 0:
            pred_labels = np.argmax(model_maxup.predict(val_data), axis=1)
            print("Validation accuracy: {}".format(
                np.sum(val_labels == pred_labels) / TEST_BATCH_SIZE))
However, my neural network does not seem to train at all. I get the following output:
Start of epoch 0
Found 9300 images belonging to 10 classes.
Training loss (for one batch) at step 0: 2.3029
Seen so far: 1 samples
Validation accuracy: 0.0975
Training loss (for one batch) at step 200: 2.4079
Seen so far: 201 samples
Training loss (for one batch) at step 400: 2.3045
Seen so far: 401 samples
Training loss (for one batch) at step 600: 2.1799
Seen so far: 601 samples
Training loss (for one batch) at step 800: 2.3446
Seen so far: 801 samples
Training loss (for one batch) at step 1000: 2.3806
Seen so far: 1001 samples
Validation accuracy: 0.085
Training loss (for one batch) at step 1200: 2.2879
Seen so far: 1201 samples
Training loss (for one batch) at step 1400: 2.3160
Seen so far: 1401 samples
I also tried the high-level model.fit method (roughly the call sketched below) and got the same result. The problem is most likely caused by my custom loss or by my data generator. Can you tell me what I am doing wrong? I am also pretty sure there is a way to make this simpler.
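From memory, the fit variant was something like this (treat it as a sketch, not the exact code I ran):

model_maxup = define_model_VGG(loss=max_loss_minimization)
model_maxup.fit(augmentation_generator(),
                steps_per_epoch=IMAGE_NUMBER,
                epochs=20)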
Thanks.
UPD: I tried random Gaussian perturbations instead of augmentations, and it does not seem to change much. Also, while digging into the problem, I noticed that when I replace the reduce_max operation with reduce_mean, the training accuracy, as well as the validation accuracy, starts to improve. However, the paper states that they somehow minimize the maximum loss. Is there another way to minimize the maximum loss?
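One idea I am considering is to smooth the max with log-sum-exp, which approaches the true maximum as the temperature t grows while still giving every example a gradient (my own sketch, not from the paper):

# Smooth surrogate for the max: logsumexp(t * loss) / t -> max(loss) as t -> inf.
def soft_max_loss(y_true, y_pred, t=10.0):
    loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
    return tf.reduce_logsumexp(t * loss) / t

Would something like this keep the "minimize the maximum loss" behaviour while avoiding the single-example gradient of a hard max?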