Question

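I have a very simple dense model: 10 input values, a hidden layer with 20 units, 1 output unit, "relu" as the activation function, and the Adam optimizer with a learning rate of 0.01.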

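library(keras)

# ncol(trainingX) inputs (10 here) -> 20 hidden units (relu) -> dropout -> 1 output unit (relu)
densemodel <- keras_model_sequential()
layer_dense(densemodel, input_shape = ncol(trainingX), units = 20, activation = "relu")
layer_dropout(densemodel, rate = 0.1)
layer_dense(densemodel, units = 1, activation = "relu")

# Adam with learning rate 0.01; gradients clipped to an L2 norm of 1
optimizer <- optimizer_adam(lr = 0.01, clipnorm = 1)
compile(densemodel, optimizer = optimizer, loss = "logcosh", metrics = list("mean_squared_error"))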

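I trained the model on n = 2e4 training records and ran into severe gradient explosion, which was eventually traced to a handful of outliers (n < 10) in the training records.

Unless the outlier records are removed, none of the following strategies, alone or in combination, resolves the gradient explosion:

kernel_regularizer, bias_regularizer, activity_regularizer, clipnorm = 1, clipvalue = 0.5 or 0.1, setting the learning rate to 1e-5, adding a dropout layer, increasing the batch size. Basically none of them works. I would expect at least clipnorm or clipvalue to work, given their definitions: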

clipnorm: Gradients will be clipped when their L2 norm exceeds this
          value.

clipvalue: Gradients will be clipped when their absolute value exceeds
          this value.
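
To make my reading of these two definitions concrete, here is a minimal base-R sketch of what I understand each option to do to a single gradient vector. The helpers clip_by_norm and clip_by_value are hypothetical names of my own, not keras functions, and the gradient values are made up; how Keras applies clipping internally may differ by version.

```r
# Rescale a gradient vector so that its L2 norm does not exceed max_norm.
# Hypothetical helper for illustration only, not part of keras.
clip_by_norm <- function(grad, max_norm) {
  grad_norm <- sqrt(sum(grad^2))
  if (grad_norm > max_norm) grad * (max_norm / grad_norm) else grad
}

# Clamp every gradient component to the interval [-max_value, max_value].
# Hypothetical helper for illustration only, not part of keras.
clip_by_value <- function(grad, max_value) {
  pmin(pmax(grad, -max_value), max_value)
}

grad <- c(3, -4, 0)  # made-up gradient with L2 norm 5

clipped_norm <- clip_by_norm(grad, 1)       # same direction, norm scaled down to 1
clipped_value <- clip_by_value(grad, 0.5)   # each component clamped independently

print(clipped_norm)
print(clipped_value)
```

Under this reading, either option should bound the size of every update, which is why I expected at least one of them to help.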

But why did they fail?

Keras dense model gradient explosion

0 answers: