How do I implement early stopping and learning rate reduction on plateau in Tensorflow?

Asked: 2019-05-13 05:41:18

Tags: python tensorflow

I want to implement two callbacks, EarlyStopping and ReduceLearningRateOnPlateau, for a neural network model built with tensorflow. (I am not using Keras.)

The sample code below shows how I implemented early stopping in my script. I don't know whether it is correct.

import numpy as np

# Validation losses recorded since the last check
val_buff = []
# If early_stop == True, then terminate training process
early_stop = False

while icount < maxEpoches:

    # Shuffle the training set
    # Update the model by using Adam optimizer over the entire training set

    # Evaluate loss on validation set
    val_loss = self.sess.run(self.loss, feed_dict=feeddict_val)
    val_buff.append(val_loss)

    # Every `ep` epochs, check how often the validation loss increased
    if icount % ep == 0:

        # Differences between consecutive validation losses
        diff = np.diff(val_buff)
        bad = np.count_nonzero(diff > 0)
        # Stop if the loss rose in more than half of the recorded steps
        if bad > 0.5 * len(diff):
            early_stop = True

        if early_stop:
            self.saver.save(self.sess, 'model.ckpt')
            # OverFlow is assumed to be an exception defined elsewhere in the script
            raise OverFlow()
        val_buff = []

    icount += 1

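When I train the model and track the loss on the validation set, I find that the loss goes up and down, so it is hard to tell when the model starts to overfit.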

Since EarlyStopping and ReduceLearningRateOnPlateau are very similar, how can I modify the code above to implement ReduceLearningRateOnPlateau?

1 Answer:

Answer 0: (score: 0)

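An oscillating error/loss is common. The main problem with implementing early stopping or learning rate reduction rules is that the validation loss is computed relatively late. To deal with this, I would suggest the following rule: stop training once at least N epochs have passed since the best validation error.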

max_stagnation = 5  # number of epochs without improvement to tolerate
best_val_loss, best_val_epoch = None, None
early_stop = False

for epoch in range(max_epochs):
    # train an epoch ...
    val_loss = evaluate()
    # a lower validation loss counts as an improvement
    if best_val_loss is None or val_loss < best_val_loss:
        best_val_loss, best_val_epoch = val_loss, epoch
    if best_val_epoch < epoch - max_stagnation:
        # nothing has improved for more than max_stagnation epochs
        early_stop = True
        break
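
The same "epochs since the best validation loss" bookkeeping could also drive a learning rate reduction rule. The following is only a minimal sketch under stated assumptions, not the Keras callback itself: the class name `PlateauLRScheduler` and its parameters `factor`, `patience` and `min_lr` are hypothetical names chosen for illustration, and the validation losses are made-up numbers.

```python
class PlateauLRScheduler:
    """Hypothetical helper: shrink the learning rate when the validation loss stalls."""

    def __init__(self, lr, factor=0.5, patience=3, min_lr=1e-6):
        self.lr = lr                # current learning rate
        self.factor = factor        # multiplier applied on a plateau (assumed < 1)
        self.patience = patience    # epochs without improvement before reducing
        self.min_lr = min_lr        # lower bound for the learning rate
        self.best = float("inf")    # best validation loss seen so far
        self.wait = 0               # epochs since the last improvement or reduction

    def step(self, val_loss):
        """Record one validation loss and return the learning rate to use next."""
        if val_loss < self.best:
            # Improvement: remember it and reset the counter
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                # Plateau: reduce the rate, but never below min_lr
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr


# Illustrative run with made-up validation losses
scheduler = PlateauLRScheduler(lr=0.01, factor=0.5, patience=2)
losses = [1.0, 0.8, 0.9, 0.85, 0.7, 0.75]
rates = [scheduler.step(loss) for loss in losses]
```

In a TensorFlow 1.x graph, one way to apply the returned value (again, an assumption about how the training script is organized) is to pass a scalar `tf.placeholder` as the optimizer's learning rate and feed the current rate through `feed_dict` at every training step.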