My understanding of activity regularization is that it looks at a layer's outputs and adds a corresponding regularization term to the loss, along the lines of:
loss = loss + activity_regularization_function(output_layer)
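(For concreteness, here is a minimal, self-contained TF 1.x sketch of that pseudocode; the input shape, the dense layer, and the stand-in task loss are all made up for illustration:)

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 10])           # hypothetical input
output_layer = tf.layers.dense(x, 5)                 # some layer output
task_loss = tf.reduce_mean(tf.square(output_layer))  # stand-in task loss

# manual activity regularization: penalize the activations themselves
activity_penalty = 1e-5 * tf.reduce_sum(tf.square(output_layer))
loss = task_loss + activity_penalty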
Now consider using the activity_regularizer argument in tf.layers, like this:
tf.layers.conv1d(..., activity_regularizer=l2_regularizer(1e-5))
...
# sum of everything registered in the REGULARIZATION_LOSSES collection
regularization_loss = tf.reduce_sum(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))
loss = loss + regularization_loss
I'd like to confirm: this should have no actual effect on the activations themselves, correct?
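For what it's worth, one way to sanity-check this is to build two identical layers that differ only in the activity_regularizer and compare their outputs (a hypothetical TF 1.x sketch; the shapes, layer names, and fixed initializer are made up). With identical weights the forward activations are the same; the regularizer only registers an extra term in tf.GraphKeys.REGULARIZATION_LOSSES, which would influence the activations only indirectly once training updates the weights:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 100, 8])
init = tf.ones_initializer()  # fixed initializer so both layers compute the same function

y_plain = tf.layers.conv1d(x, filters=4, kernel_size=3,
                           kernel_initializer=init, name='plain')
y_reg = tf.layers.conv1d(x, filters=4, kernel_size=3,
                         kernel_initializer=init, name='reg',
                         activity_regularizer=tf.contrib.layers.l2_regularizer(1e-5))

reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    data = np.random.rand(2, 100, 8).astype(np.float32)
    a, b = sess.run([y_plain, y_reg], {x: data})
    print(np.allclose(a, b))                 # True: forward activations identical
    print(sess.run(reg_losses, {x: data}))   # non-empty: the penalty lives only in the loss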