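I am trying to implement a custom dropout layer. In the forward pass, I want the inputs to pass through unchanged, with nothing dropped. In the backward pass, I want gradients to flow to only some of the inputs, while the gradients for the rest are frozen. A probability decides which gradients are updated and which are frozen.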
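One way to write down the intended behavior, as a sketch only. The notation is assumed here and is not taken from Keras: $m$ is a random 0/1 mask in which 1 marks a frozen input, $\odot$ is element-wise multiplication, and $\operatorname{sg}$ is a stop-gradient operation (identity in the forward pass, zero derivative in the backward pass).

```latex
\begin{aligned}
y &= \operatorname{sg}(x \odot m) + x \odot (1 - m) \\
\text{forward:}\quad y &= x \odot m + x \odot (1 - m) = x \\
\text{backward:}\quad \frac{\partial L}{\partial x} &= (1 - m) \odot \frac{\partial L}{\partial y}
\end{aligned}
```

So the output always equals the input, while each entry of the incoming gradient is either passed through unchanged or replaced by zero.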
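I have implemented the custom layer, but because the modification is subtle, it is hard to verify that it is correct. An incorrect implementation could still produce plausible outputs. I started from the existing Dropout layer in Keras and modified it: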
import keras
import tensorflow as tf
from keras.layers import Layer


class MyDropout(Layer):
    """Applies Dropout to the input.

    Dropout consists in randomly setting
    a fraction `rate` of input units to 0 at each update during training time,
    which helps prevent overfitting.

    # Arguments
        rate: float between 0 and 1. Fraction of the input units to drop.
        noise_shape: 1D integer tensor representing the shape of the
            binary dropout mask that will be multiplied with the input.
            For instance, if your inputs have shape
            `(batch_size, timesteps, features)` and
            you want the dropout mask to be the same for all timesteps,
            you can use `noise_shape=(batch_size, 1, features)`.
        seed: A Python integer to use as random seed.

    # References
        - [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](
           http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf)
    """

    def __init__(self, rate, noise_shape=None, seed=None, **kwargs):
        super(MyDropout, self).__init__(**kwargs)
        self.rate = min(1., max(0., rate))
        self.noise_shape = noise_shape
        self.seed = seed
        self.supports_masking = True

    def _get_noise_shape(self, inputs):
        if self.noise_shape is None:
            return self.noise_shape
        symbolic_shape = keras.backend.shape(inputs)
        noise_shape = [symbolic_shape[axis] if shape is None else shape
                       for axis, shape in enumerate(self.noise_shape)]
        return tuple(noise_shape)

    def call(self, inputs, training=None):
        if 0. < self.rate < 1.:
            # NOTE: noise_shape is computed but not used by the mask below.
            noise_shape = self._get_noise_shape(inputs)
            # Generate uniform random numbers in [0, 1) with the same shape as
            # the input (random_normal would not give a probability threshold).
            uniform_random_number = keras.backend.random_uniform(
                shape=keras.backend.shape(inputs), seed=self.seed)
            # 1.0 where the random number is greater than the rate; the
            # gradient is stopped there, i.e. with probability 1 - rate.
            indices_greater_than = tf.greater(uniform_random_number, self.rate,
                                              name='stoppedGradientLocations')
            indices_greater_than = tf.cast(indices_greater_than, dtype=tf.float32)
            inputs_copy = tf.identity(inputs)
            # Frozen part: the values pass through, the gradient does not.
            out1 = tf.stop_gradient(inputs_copy * indices_greater_than)
            # Remaining part: both values and gradient pass through.
            indices_less_than = 1 - indices_greater_than
            out2 = inputs * indices_less_than
            out_total = out1 + out2
            return out_total
        # rate is exactly 0 or 1: return the input unchanged instead of None.
        return inputs

    def get_config(self):
        config = {'rate': self.rate,
                  'noise_shape': self.noise_shape,
                  'seed': self.seed}
        # super() must name this class, not Dropout.
        base_config = super(MyDropout, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def compute_output_shape(self, input_shape):
        return input_shape
What is the best way to verify my implementation, that is, to confirm the code works as intended?
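One possible check is sketched below. It rests on assumptions that are not part of the post: it does not call the layer itself, `masked_gradient_identity` is a hypothetical stand-in that repeats the same stop-gradient arithmetic, and it relies on TensorFlow 2.x eager execution with `tf.GradientTape`. The idea is to use a loss whose gradient with respect to the output is 1 everywhere, so the gradient with respect to the input exposes the mask directly.

```python
import numpy as np
import tensorflow as tf


def masked_gradient_identity(inputs, rate, seed=None):
    """Hypothetical stand-in for the layer: identity forward, masked backward."""
    u = tf.random.uniform(tf.shape(inputs), seed=seed)
    frozen = tf.cast(u > rate, inputs.dtype)  # 1.0 where the gradient is blocked
    return tf.stop_gradient(inputs * frozen) + inputs * (1.0 - frozen)


rate = 0.3
x = tf.Variable(np.random.randn(200, 500).astype("float32"))

with tf.GradientTape() as tape:
    y = masked_gradient_identity(x, rate)
    loss = tf.reduce_sum(y)  # d(loss)/dy is 1 for every element

grad = tape.gradient(loss, x).numpy()

# Check 1: the forward pass leaves the values unchanged.
forward_ok = np.allclose(y.numpy(), x.numpy())
# Check 2: every gradient entry is exactly 0 (frozen) or 1 (passed through).
binary_ok = bool(np.all((grad == 0.0) | (grad == 1.0)))
# Check 3: the fraction of entries that receive a gradient should be near `rate`.
passed_fraction = float(grad.mean())

print(forward_ok, binary_ok, round(passed_fraction, 3))
```

If the same three checks are run against the real layer, a wrong threshold distribution or an inverted mask should show up in the third number, even though the forward outputs look correct either way.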