Why does the loss increase for deeper layers in a network with multiple losses?

Time: 2017-03-30 11:01:19

Tags: machine-learning neural-network deep-learning caffe

I am training my network with a ResNet-style architecture. After every 16 layers I attach a deconvolution layer followed by a loss layer (SoftmaxWithLoss). I noticed that the losses of the deeper branches (loss3 and loss4) are higher than loss2. Why does this happen, and how should I choose the loss weight for each loss layer? Currently I use loss_weight=0.1 for every loss layer except loss_main (a rough sketch of how each auxiliary branch is defined is shown after the log below).

Iteration 9960, loss = 0.287316
    Train net output #0: loss_main = 0.0921776 (* 1 = 0.0921776 loss)
    Train net output #1: loss1 = 0.259363 (* 0.1 = 0.0259363 loss)
    Train net output #2: loss2 = 0.14823 (* 0.1 = 0.014823 loss)
    Train net output #3: loss3 = 0.169563 (* 0.1 = 0.0169563 loss)
    Train net output #4: loss4 = 0.21544 (* 0.1 = 0.021544 loss)
Iteration 9980, lr = 0.002
Iteration 9980, loss = 0.286957
    Train net output #0: loss_main = 0.151433 (* 1 = 0.151433 loss)
    Train net output #1: loss1 = 0.362414 (* 0.1 = 0.0362414 loss)
    Train net output #2: loss2 = 0.267339 (* 0.1 = 0.0267339 loss)
    Train net output #3: loss3 = 0.304756 (* 0.1 = 0.0304756 loss)
    Train net output #4: loss4 = 0.393892 (* 0.1 = 0.0393892 loss)
Iteration 10000, lr = 0.002
Iteration 10000, loss = 0.287502
    Train net output #0: loss_main = 0.149631 (* 1 = 0.149631 loss)
    Train net output #1: loss1 = 0.377756 (* 0.1 = 0.0377756 loss)
    Train net output #2: loss2 = 0.252874 (* 0.1 = 0.0252874 loss)
    Train net output #3: loss3 = 0.26978 (* 0.1 = 0.026978 loss)
    Train net output #4: loss4 = 0.355817 (* 0.1 = 0.0355817 loss)
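
For reference, this is roughly how each auxiliary branch is defined in my prototxt; the layer and blob names here (res16, deconv1_score, num_output, etc.) are simplified placeholders, not the exact values from my network:

layer {
  name: "deconv1"
  type: "Deconvolution"
  bottom: "res16"            # output of a residual block (placeholder name)
  top: "deconv1_score"
  convolution_param {
    num_output: 2            # number of classes (placeholder)
    kernel_size: 4
    stride: 2
  }
}
layer {
  name: "loss1"
  type: "SoftmaxWithLoss"
  bottom: "deconv1_score"
  bottom: "label"
  top: "loss1"
  loss_weight: 0.1           # auxiliary loss down-weighted relative to loss_main
}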

1 Answer:

Answer 0 (score: 4):

The loss is not a monotonically decreasing function; it fluctuates up and down. As long as the overall trend is downward, training is proceeding as expected. Since you haven't provided a long enough baseline of the behavior, I can't say conclusively whether there is a problem with your model.

You can, of course, adjust the loss weights in the source code, but I don't recommend it. I think you can also override them in train_val.prototxt, but I can't find that reference right now.
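
If it works the way I remember, loss_weight is a per-layer field in the prototxt, so each loss layer can carry its own weight there. A minimal sketch (blob names are placeholders, and the 0.05 is just an illustrative value, not a recommendation):

layer {
  name: "loss4"
  type: "SoftmaxWithLoss"
  bottom: "deconv4_score"    # placeholder blob name
  bottom: "label"
  top: "loss4"
  loss_weight: 0.05          # if omitted, SoftmaxWithLoss defaults to loss_weight: 1
}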