I tried SGD, Adadelta, AdaBound, and Adam. All of them give me a fluctuating validation accuracy. I also tried every activation function available in Keras, but the fluctuation in val_acc remains.
Training samples: 1352
Validation samples: 339
[Plot: validation accuracy per epoch, showing the fluctuation]
# Assumed setup: tf.keras (TF >= 2.2, where the "swish" activation string is
# available); input_shape, channel_dim, and nc (number of classes) are defined
# elsewhere in my code.
from tensorflow.keras.layers import (Input, Conv2D, Activation, BatchNormalization,
                                     MaxPooling2D, Dropout, Flatten, Dense)
from tensorflow.keras.models import Model

# first (and only) CONV => RELU => POOL block
inpt = Input(shape=input_shape)
x = Conv2D(32, (3, 3), padding="same")(inpt)
x = Activation("swish")(x)
x = BatchNormalization(axis=channel_dim)(x)
x = MaxPooling2D(pool_size=(3, 3))(x)
# x = Dropout(0.25)(x)

# first CONV => RELU => CONV => RELU => POOL block
x = Conv2D(64, (3, 3), padding="same")(x)
x = Activation("swish")(x)
x = BatchNormalization(axis=channel_dim)(x)
x = Conv2D(64, (3, 3), padding="same")(x)
x = Activation("swish")(x)
x = BatchNormalization(axis=channel_dim)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
# x = Dropout(0.25)(x)

# second CONV => RELU => CONV => RELU => POOL block
x = Conv2D(128, (3, 3), padding="same")(x)
x = Activation("swish")(x)
x = BatchNormalization(axis=channel_dim)(x)
x = Conv2D(128, (3, 3), padding="same")(x)
x = Activation("swish")(x)
x = BatchNormalization(axis=channel_dim)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
# x = Dropout(0.25)(x)

# fully connected head
x = Flatten()(x)  # could be changed to GlobalMaxPooling2D
x = Dense(256, activation='swish')(x)
x = BatchNormalization(axis=channel_dim)(x)
x = Dropout(0.4)(x)
x = Dense(128, activation='swish')(x)
x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Dense(64, activation='swish')(x)
x = BatchNormalization()(x)
x = Dropout(0.3)(x)
x = Dense(32, activation='swish')(x)
x = BatchNormalization()(x)
x = Dense(nc, activation='softmax')(x)

model = Model(inputs=inpt, outputs=x)
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
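The training call itself is not shown above; a minimal sketch of how the validation accuracy is obtained would look like the following (the array names X_train, y_train, X_val, y_val and the epoch/batch settings are placeholders, not my original values):

# Sketch of the training loop; variable names and hyperparameters are
# placeholders, not the original code.
history = model.fit(
    X_train, y_train,                   # 1352 training samples
    validation_data=(X_val, y_val),     # 339 validation samples
    epochs=100,
    batch_size=32,
)
# history.history['val_accuracy'] (or 'val_acc' in older Keras versions)
# holds the fluctuating per-epoch validation accuracy.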
Answer 0 (score: 0)
Your model may be overly sensitive to noise; see this answer.
Based on the linked answer and on what I can see of your model, the network is probably too deep for the amount of data you have (a large model with too little data => overfitting => sensitivity to noise). I suggest a sanity check with a much simpler model; a sketch follows below.
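For illustration only, a much smaller baseline could look like this (a sketch, not your exact pipeline; input_shape and nc are the same variables assumed in your code):

# Minimal sanity-check model: one small conv stack and a tiny head.
# input_shape and nc are assumed to be defined as in the question's code.
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D,
                                     GlobalAveragePooling2D, Dropout, Dense)
from tensorflow.keras.models import Model

inp = Input(shape=input_shape)
y = Conv2D(16, (3, 3), padding="same", activation="relu")(inp)
y = MaxPooling2D(pool_size=(2, 2))(y)
y = Conv2D(32, (3, 3), padding="same", activation="relu")(y)
y = GlobalAveragePooling2D()(y)
y = Dropout(0.3)(y)
out = Dense(nc, activation="softmax")(y)

simple_model = Model(inputs=inp, outputs=out)
simple_model.compile(loss="categorical_crossentropy",
                     optimizer="adam",
                     metrics=["accuracy"])

If this small model already gives a stable validation curve, the instability is coming from the capacity of the larger network rather than the data pipeline.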
The learning rate is another likely cause (as Neb mentioned). You are using SGD's default learning rate (0.01), which may be too high; try 1e-3 or lower.
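For example (a sketch assuming tf.keras; older standalone Keras versions take lr= instead of learning_rate=):

from tensorflow.keras.optimizers import SGD

# Recompile with an explicit, lower learning rate instead of the string 'sgd',
# which uses the default of 0.01.
model.compile(
    loss="categorical_crossentropy",
    optimizer=SGD(learning_rate=1e-3),
    metrics=["accuracy"],
)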