Is it possible to add L2 regularization when using the layers defined in tf.layers?
It seems to me that since tf.layers is a high-level wrapper, there is no easy way to access the filter weights.
With tf.nn.conv2d:
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
weights = tf.get_variable(
    name="weights",
    shape=[3, 3, 64, 64],  # illustrative [filter_h, filter_w, in_channels, out_channels]; get_variable needs a shape
    regularizer=regularizer
)
# Previous layers
...
# Second layer
layer2 = tf.nn.conv2d(
    input,
    weights,
    strides=[1, 1, 1, 1],
    padding="SAME")
# More layers
...
# Loss
loss = ...  # some task loss
reg_variables = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
reg_term = tf.contrib.layers.apply_regularization(regularizer, reg_variables)
loss += reg_term
Now, what would this look like with tf.layers.conv2d?
Thanks!
Answer 0 (score: 31)
You can pass them to tf.layers.conv2d as arguments:
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
layer2 = tf.layers.conv2d(
    inputs,
    filters,
    kernel_size,
    kernel_regularizer=regularizer)
Then you should add the regularization loss to your loss:
l2_loss = tf.losses.get_regularization_loss()
loss += l2_loss
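For reference, tf.losses.get_regularization_loss() simply sums the tf.GraphKeys.REGULARIZATION_LOSSES collection, so the manual pattern from the question keeps working with tf.layers; a minimal sketch:

reg_variables = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
reg_term = tf.add_n(reg_variables)  # same value as tf.losses.get_regularization_loss()
loss += reg_term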
EDIT: Thanks to Zeke Arneodo, Tom and srcolinas, I added that last bit of feedback so that the accepted answer provides the complete solution.
Answer 1 (score: 16)
Isn't the answer in your question? You can also use tf.losses.get_regularization_loss (https://www.tensorflow.org/api_docs/python/tf/losses/get_regularization_loss), which will collect all the REGULARIZATION_LOSSES:
...
layer2 = tf.layers.conv2d(
    input,
    filters,
    kernel_size,
    kernel_regularizer=tf.contrib.layers.l2_regularizer(scale=0.1))
...
l2_loss = tf.losses.get_regularization_loss()
loss += l2_loss
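If you only want the penalties from part of the model, get_regularization_loss also takes a scope filter; a small sketch, assuming the layer was created inside a variable scope named "conv_block" (the name is illustrative):

with tf.variable_scope("conv_block"):
    layer2 = tf.layers.conv2d(
        input,
        filters,
        kernel_size,
        kernel_regularizer=tf.contrib.layers.l2_regularizer(scale=0.1))
# Only the regularization losses created under "conv_block" are summed:
block_l2 = tf.losses.get_regularization_loss(scope="conv_block")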
Answer 2 (score: 3)
I see two incomplete answers, so here is the complete one:
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
layer2 = tf.layers.conv2d(
    inputs,
    filters,
    kernel_size,
    kernel_regularizer=regularizer)
Or:
layer2 = tf.layers.conv2d(
    inputs,
    filters,
    kernel_size,
    kernel_regularizer=tf.contrib.layers.l2_regularizer(scale=0.1))
Don't forget to add it to your final loss:
l2_loss = tf.losses.get_regularization_loss()
....
loss += l2_loss
Basically, add the regularizer when you define a layer, and then make sure you add the regularization losses to your loss, as in the sketch below.
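Putting the two steps together, a minimal self-contained sketch (the input shape, filter count, and dummy task loss are all illustrative):

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 28, 28, 1])  # illustrative shape
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
layer2 = tf.layers.conv2d(
    inputs,
    filters=32,
    kernel_size=3,
    kernel_regularizer=regularizer)
task_loss = tf.reduce_mean(layer2)  # stand-in for a real task loss
loss = task_loss + tf.losses.get_regularization_loss()
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)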
Answer 3 (score: 0)
With eager execution there are two ways:

1. Compute it manually:
   tf.add_n([tf.square(i) for i in layer.variables]) * l2_coef
2. Use layer.losses, available when the layer was created with a kernel_regularizer.

As shown in the official example densenet_test.py:
rand_input = tf.random_uniform((10, 3, 32, 32))
weight_decay = 1e-4

conv = tf.keras.layers.Conv2D(
    3, (3, 3),
    padding='same',
    use_bias=False,
    kernel_regularizer=tf.keras.regularizers.l2(weight_decay))

optimizer = tf.train.GradientDescentOptimizer(0.1)
conv(rand_input)  # Initialize the variables in the layer

def compute_true_l2(vs, wd):
    return tf.reduce_sum(tf.square(vs)) * wd

true_l2 = compute_true_l2(conv.variables, weight_decay)
keras_l2 = tf.add_n(conv.losses)
self.assertAllClose(true_l2, keras_l2)

with tf.GradientTape() as tape_true, tf.GradientTape() as tape_keras:
    loss = tf.reduce_sum(conv(rand_input))
    loss_with_true_l2 = loss + compute_true_l2(conv.variables, weight_decay)
    loss_with_keras_l2 = loss + tf.add_n(conv.losses)

true_grads = tape_true.gradient(loss_with_true_l2, conv.variables)
keras_grads = tape_keras.gradient(loss_with_keras_l2, conv.variables)
self.assertAllClose(true_grads, keras_grads)
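Note that densenet_test.py runs under eager execution; in TF 1.x that has to be switched on at program startup, before any ops are built (a minimal sketch):

import tensorflow as tf
tf.enable_eager_execution()  # must be called before any other TensorFlow ops in TF 1.x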