Minimizing the maximum loss over a batch of augmented data

Time: 2020-07-12 13:59:42

Tags: python tensorflow machine-learning keras neural-network

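I am trying to implement the MaxUp method (https://arxiv.org/pdf/2002.09024v1.pdf) to improve the generalization of an image classifier. As I understand it, the essence of the method is that for every data point in the dataset we generate a small set of perturbed or augmented data points (in my case I used augmentation), which gives a batch of m augmented images. We then pass that batch through the neural network and compute the loss for each example in the batch. After that, we use only the maximum loss to optimize the network. In effect, we minimize the maximum loss.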
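Written out, my reading of that objective is roughly the following. The notation is my own assumption rather than copied from the paper: n data points (x_i, y_i), m augmented copies x_i^{(j)} of each point, a network f_\theta, and a per-example loss L.

```latex
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \; \max_{1 \le j \le m} \; L\big(f_\theta(x_i^{(j)}),\, y_i\big)
```

The max runs over the m copies of one data point, and the average runs over data points. With a single data point per step (n = 1), this reduces to the plain maximum over the batch.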

I am using the pre-balanced Imagenette dataset (https://github.com/fastai/imagenette).

So I built a simple convnet made of VGG-style blocks.

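import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# IMAGE_SIZE, AUG_BATCH_SIZE, TEST_BATCH_SIZE, IMAGE_NUMBER, train_dir and test_dir
# are assumed to be defined elsewhere.
def define_model_VGG(loss):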
    momentum = 0.9
    #VGG
    model = Sequential()
    model.add(Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal', input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3)))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))

    model.add(Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))

    model.add(Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))

    model.add(Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))

    model.add(Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'))
    model.add(BatchNormalization(momentum = momentum, center=True, scale=False))
    model.add(MaxPooling2D(2, 2))

    model.add(Flatten())
    model.add(Dense(512, activation = 'relu', kernel_initializer = 'he_normal'))
    model.add(Dense(256, activation = 'relu', kernel_initializer = 'he_normal'))
    model.add(Dense(10, activation = 'softmax'))

    opt = SGD(learning_rate=0.001, momentum=0.9)  # `lr` is a deprecated alias of `learning_rate`
    model.compile(optimizer=opt, loss=loss, metrics=['accuracy'])
    return model

I defined a custom loss function so that it returns the maximum loss instead of the mean loss.

def max_loss_minimization(y_true, y_pred):
    """
    Returns max loss
    """
    loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
    max_loss = tf.keras.backend.max(loss)
    return max_loss
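
If a batch ever holds augmented copies of more than one image, taking the max over the whole batch is no longer the same as taking it per image. The NumPy sketch below shows a grouped variant, assuming the batch is laid out as n groups of m consecutive copies. The helper names `categorical_crossentropy` and `maxup_loss` are hypothetical, and the numbers are made up for illustration.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-example cross-entropy for one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

def maxup_loss(y_true, y_pred, m):
    """Max over each group of m consecutive examples, then mean over groups.

    Assumes the batch consists of n groups of m augmented copies each.
    """
    per_example = categorical_crossentropy(y_true, y_pred)  # shape (n * m,)
    per_group = per_example.reshape(-1, m)                   # shape (n, m)
    return per_group.max(axis=1).mean()

# Two images with two augmented copies each (illustrative values only).
y_true = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
y_pred = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.4, 0.6]])

loss_grouped = maxup_loss(y_true, y_pred, m=2)  # mean of the two per-image maxima
loss_global = maxup_loss(y_true, y_pred, m=4)   # single group: plain max over the batch
```

In TensorFlow the same steps could be expressed with tf.reshape, tf.reduce_max(..., axis=1) and tf.reduce_mean. With one image per batch, as in my generator, the grouped and global versions coincide.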

I wrote a generator that returns an arbitrary number of augmented images based on a single image.

def augmentation_generator(aug_batch_size=AUG_BATCH_SIZE):
    """
    Returns aug_batch_size augmented images of one data point
    """
    maxup_datagen = ImageDataGenerator(rotation_range=30,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)

    train_datagen = ImageDataGenerator(rescale=1./255)
    maxup_train_generator = train_datagen.flow_from_directory(
            train_dir,
            target_size=(IMAGE_SIZE, IMAGE_SIZE),
            shuffle=True,
            batch_size=1,
            class_mode='categorical')

    while True:
        
        next_image, next_label = next(maxup_train_generator)
        i = 0
        X = np.zeros((aug_batch_size, IMAGE_SIZE, IMAGE_SIZE, 3))
        y = np.zeros((aug_batch_size, 10))

        for x_batch, y_batch in maxup_datagen.flow(next_image, next_label, batch_size=1):
            X[i,:,:,:] = x_batch
            y[i,:] = y_batch
            i += 1
            if i == aug_batch_size:
                break

        yield X, y
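
For comparison, here is a minimal sketch of how a batch with several source images, each repeated m times, could be assembled so that the copies of one image stay adjacent. The `augment` function is a hypothetical stand-in (random flip plus small Gaussian noise), not the ImageDataGenerator pipeline above, and `make_maxup_batch` is likewise an illustrative name.

```python
import numpy as np

def augment(image, rng):
    """Hypothetical stand-in augmentation: random horizontal flip plus small noise."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]
    return image + rng.normal(0.0, 0.01, size=image.shape)

def make_maxup_batch(images, labels, m, rng):
    """Augment each of the n images m times; copies of one image are consecutive."""
    x = np.stack([augment(img, rng) for img in images for _ in range(m)])
    y = np.repeat(labels, m, axis=0)  # label i repeated m times, same order as x
    return x, y

rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))      # 4 dummy images of size 8x8
labels = np.eye(10)[[0, 3, 3, 7]]      # one-hot labels for 10 classes
x, y = make_maxup_batch(images, labels, m=5, rng=rng)
```

Here x has shape (20, 8, 8, 3) and y has shape (20, 10), so a loss that reshapes per-example values to (4, 5) sees one row per source image.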

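During training, I generate a batch of augmented images, run a forward pass, compute the maximum loss, and update the model by applying the gradients.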

test_datagen = ImageDataGenerator(rescale=1./255)
test_gen = test_datagen.flow_from_directory(
    test_dir,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=TEST_BATCH_SIZE,
    shuffle=True,
    class_mode='categorical'
)

val_data, val_labels = next(test_gen)
val_labels = np.array([np.where(a==1)[0][0] for a in val_labels]) # validation data

model_maxup = define_model_VGG(loss=max_loss_minimization)

epochs = 20
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))
    
    aug_generator = augmentation_generator()

    # Iterate over the batches of the dataset.
    for step in range(0, IMAGE_NUMBER):
        
        x_batch_train, y_batch_train = next(aug_generator)
        x_batch_train = tf.convert_to_tensor(x_batch_train, dtype='float32')
        y_batch_train = tf.convert_to_tensor(y_batch_train, dtype='float32')
        # Open a GradientTape to record the operations run
        # during the forward pass, which enables autodifferentiation.
        with tf.GradientTape() as tape:

            # Run the forward pass of the layer.
            # The operations that the layer applies
            # to its inputs are going to be recorded
            # on the GradientTape.
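            # training=True makes BatchNormalization use batch statistics and update its moving averages.
            logits = model_maxup(x_batch_train, training=True)  # softmax probabilities (the model ends in softmax), not raw logits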

            # Compute the loss value for this minibatch.
            loss_value = max_loss_minimization(y_batch_train, logits)

        # Use the gradient tape to automatically retrieve
        # the gradients of the trainable variables with respect to the loss.
        grads = tape.gradient(loss_value, model_maxup.trainable_weights)

        # Run one step of gradient descent by updating
        # the value of the variables to minimize the loss.
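        # `opt` is local to define_model_VGG, so use the optimizer attached to the compiled model.
        model_maxup.optimizer.apply_gradients(zip(grads, model_maxup.trainable_weights))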

        # Log every 200 batches.
        if step % 200 == 0:
            print(
                "Training loss (for one batch) at step %d: %.4f"
                % (step, float(loss_value))
            )
            print("Seen so far: %s samples" % ((step + 1)))
            
        if step % 1000 == 0:
            pred_labels = np.argmax(model_maxup.predict(val_data),axis=1)
            print("Validation accuracy: {}".format(sum(val_labels==pred_labels)/TEST_BATCH_SIZE))

However, my neural network does not seem to train at all. I get the following output:

Start of epoch 0
Found 9300 images belonging to 10 classes.
Training loss (for one batch) at step 0: 2.3029
Seen so far: 1 samples
Validation accuracy: 0.0975
Training loss (for one batch) at step 200: 2.4079
Seen so far: 201 samples
Training loss (for one batch) at step 400: 2.3045
Seen so far: 401 samples
Training loss (for one batch) at step 600: 2.1799
Seen so far: 601 samples
Training loss (for one batch) at step 800: 2.3446
Seen so far: 801 samples
Training loss (for one batch) at step 1000: 2.3806
Seen so far: 1001 samples
Validation accuracy: 0.085
Training loss (for one batch) at step 1200: 2.2879
Seen so far: 1201 samples
Training loss (for one batch) at step 1400: 2.3160
Seen so far: 1401 samples

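I also tried the high-level model.fit method and got the same result. The problem is most likely in my custom loss or in the data generator. Can you tell me what I am doing wrong? I am pretty sure there is an easier way to do this.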

Thanks.

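UPD: I tried random Gaussian perturbations instead of augmentation, and it did not seem to change much. Also, while continuing to investigate, I noticed that when I replace the "reduce_max" operation with "reduce_mean", both training and validation accuracy start to improve. However, the paper says that they minimize the maximum loss. Is there another way to minimize the maximum loss?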
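
One property of the two reductions that may be relevant here is that the gradient of a max flows only through the single largest element, while the gradient of a mean is spread evenly over all elements. The sketch below checks this numerically on made-up loss values. It is only an observation about the operators, not a diagnosis of the training problem, and `numerical_grad` is a hypothetical helper.

```python
import numpy as np

def numerical_grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at the vector x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

# Made-up per-example losses for a batch of 4 augmented images.
losses = np.array([2.1, 2.4, 2.3, 2.2])

grad_max = numerical_grad(np.max, losses)    # nonzero only at the largest loss
grad_mean = numerical_grad(np.mean, losses)  # 1/4 for every example
```

With a max over the whole batch, each step therefore updates the network from one example only, regardless of the batch size.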

0 answers:

No answers