Question

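I have several questions about how regularization and biases work in Caffe.

First, biases exist in the network by default, is that right? Or do I need to ask Caffe to add them?

Second, when it computes the loss value, it does not take regularization into account. Is that right? I mean the loss contains only the loss function value. As far as I understand, regularization is considered only in the gradient calculation. Is that right?

Third, when Caffe computes the gradients, does it also include the bias values in the regularization? Or does it only include the network's weights in the regularization?

Thanks in advance,

Afshin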

Answer 1

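For your three questions, my answers are:

First, yes. Biases exist in the network by default. For example, in ConvolutionParameter and InnerProductParameter in caffe.proto, the default value of bias_term is true, which means the convolution and inner product layers in a network have a bias by default.
Second, yes. The loss value produced by the loss layer does not include the value of the regularization term. Regularization is only taken into account after net_->ForwardBackward() has been called, in the ApplyUpdate() function, where the network parameters are updated.

Third, take a convolution layer in a network, for example: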

layer {
  name: "SomeLayer"
  type: "Convolution"
  bottom: "data"
  top: "conv"
  # for the weights
  param {
    lr_mult: 1
    decay_mult: 1.0  # regularization coefficient for the weights
                     # (1.0 is the default; written out for clarity)
  }
  # for the bias
  param {
    lr_mult: 2
    decay_mult: 1.0  # regularization coefficient for the bias
                     # (1.0 is the default; written out for clarity)
  }
  ...  # remaining fields omitted
}

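The answer to the third question is: when Caffe computes the gradients, the solver includes the bias values in the regularization only if two variables are both larger than zero: the second decay_mult above and the solver's weight_decay.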
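To make that condition concrete, here is a minimal Python sketch. It is not Caffe code: the function names effective_decay and is_regularized and the value 0.0005 are hypothetical and chosen only for illustration. It assumes the per-blob coefficient is the product of the solver's weight_decay and the blob's decay_mult.

```python
def effective_decay(weight_decay, decay_mult):
    """Per-blob decay coefficient (assumed to be the product of the two settings)."""
    return weight_decay * decay_mult


def is_regularized(weight_decay, decay_mult):
    """Condition stated in the answer: both values must be larger than zero."""
    return weight_decay > 0 and decay_mult > 0


# Hypothetical solver setting; decay_mult values mirror the two param blocks above,
# plus a third case where the bias is excluded by setting decay_mult to 0.
weight_decay = 0.0005
for name, decay_mult in [("weights", 1.0), ("bias", 1.0), ("bias, decay_mult: 0", 0.0)]:
    print(name, effective_decay(weight_decay, decay_mult),
          is_regularized(weight_decay, decay_mult))
```

Under these assumptions, setting decay_mult to 0 in the second param block would leave the bias out of the regularization while the weights are still decayed.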

The details can be found in the function void SGDSolver::Regularize().
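As a rough illustration of what such a regularization step does in the L2 case, here is a simplified Python sketch. It is not the actual Caffe implementation: regularize_l2 and sgd_update are hypothetical helpers, local_decay stands for the product of weight_decay and the blob's decay_mult, and the numbers are made up. Details such as lr_mult, momentum, and L1 regularization are left out.

```python
def regularize_l2(data, diff, local_decay):
    """Return the gradient with the L2 decay term added.

    data: current parameter values (weights or biases)
    diff: gradient of the loss function alone, from the backward pass
    local_decay: weight_decay * decay_mult for this parameter blob
    """
    if local_decay == 0:
        return list(diff)  # no regularization for this blob
    return [g + local_decay * w for w, g in zip(data, diff)]


def sgd_update(data, diff, lr):
    """Plain gradient step: w <- w - lr * g."""
    return [w - lr * g for w, g in zip(data, diff)]


loss = 0.75             # hypothetical loss-layer output; regularization never touches it
bias = [1.0, -2.0]      # hypothetical bias values
bias_grad = [0.5, 0.5]  # hypothetical gradient from the backward pass

reg_grad = regularize_l2(bias, bias_grad, local_decay=0.5)
new_bias = sgd_update(bias, reg_grad, lr=0.5)
print(loss, reg_grad, new_bias)
```

The point of the sketch is the ordering: the loss is reported from the forward pass as is, and the decay term only enters through the gradient that is used for the parameter update.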

Hope this helps.
