Question

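I am trying to implement weight noise regularization the way he did in his PhD thesis, but I have several questions about how to implement it. The algorithm should look like this: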

 while stopping criteria not met do
   Randomize training set order
   for each example in the training set do
     Add zero mean Gaussian Noise to weights
     Run forward and backward pass to calculate the gradient
     Restore original weights
     Update weights with gradient descent algorithm
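
Read literally, the steps above can be sketched in plain Python as follows. This is only an illustration under assumptions that are not from the thesis: a toy one-dimensional linear model, a squared-error loss, a fixed epoch count as the stopping criterion, and made-up values for the hypothetical parameters `noise_std`, `lr` and `epochs`.

```python
import random


def train_with_weight_noise(data, noise_std=0.01, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x + b by SGD, evaluating each gradient at noisy weights."""
    rng = random.Random(seed)
    data = list(data)
    weights = [0.0, 0.0]  # [w, b]
    for _ in range(epochs):  # stopping criterion (assumed): fixed epoch count
        rng.shuffle(data)  # randomize training set order
        for x, y in data:
            original = list(weights)
            # Add zero-mean Gaussian noise to a copy of the weights.
            noisy = [w + rng.gauss(0.0, noise_std) for w in original]
            # Forward and backward pass at the noisy weights
            # (loss = 0.5 * err ** 2, so d(loss)/dw = err * x, d(loss)/db = err).
            err = noisy[0] * x + noisy[1] - y
            grad = [err * x, err]
            # "Restore" is implicit: the update is applied to the clean copy.
            weights = [w - lr * g for w, g in zip(original, grad)]
    return weights


# Toy data drawn from y = 3x + 1 (an assumption made for the demo).
data = [(i / 10.0, 3.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]
w, b = train_with_weight_noise(data)
print(w, b)
```

The point to notice is that the noise only affects where the gradient is evaluated; the update itself is applied to the unperturbed weights.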

Can someone explain how to do this?

Edit (June 16, 2016)

Here is my code:

# e.g: log filter bank or MFCC features
# Has size [batch_size, max_stepsize, num_features], but the
# batch_size and max_stepsize can vary along each step
inputs = tf.placeholder(tf.float32, [None, None, num_features])

# Here we use sparse_placeholder that will generate a
# SparseTensor required by ctc_loss op.
targets = tf.sparse_placeholder(tf.int32)

# 1d array of size [batch_size]
seq_len = tf.placeholder(tf.int32, [None])

# Defining the cell
# Can be:
#   tf.nn.rnn_cell.RNNCell
#   tf.nn.rnn_cell.GRUCell
cell = tf.nn.rnn_cell.LSTMCell(num_hidden, state_is_tuple=True)

# Stacking rnn cells
stack = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers,
                                    state_is_tuple=True)

# The second output is the last state, which we will not use.
# Pass the stacked cell (not the single cell) so num_layers takes effect.
outputs, _ = tf.nn.dynamic_rnn(stack, inputs, seq_len, dtype=tf.float32)

shape = tf.shape(inputs)
batch_s, max_timesteps = shape[0], shape[1]

# Reshaping to apply the same weights over the timesteps
outputs = tf.reshape(outputs, [-1, num_hidden])

# Truncated normal with mean 0 and stdev=0.1
# Tip: Try another initialization
# see https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.layers.html#initializers
W = tf.Variable(tf.truncated_normal([num_hidden,
                                     num_classes],
                                    stddev=0.1))
# Zero initialization
# Tip: Is tf.zeros_initializer the same?
b = tf.Variable(tf.constant(0., shape=[num_classes]))

# Doing the affine projection
logits = tf.matmul(outputs, W) + b

# Reshaping back to the original shape
logits = tf.reshape(logits, [batch_s, -1, num_classes])

# Time major
logits = tf.transpose(logits, (1, 0, 2))

loss = tf.contrib.ctc.ctc_loss(logits, targets, seq_len)
cost = tf.reduce_mean(loss)

optimizer = tf.train.MomentumOptimizer(initial_learning_rate,
                                       0.9).minimize(cost)

# Option 2: tf.contrib.ctc.ctc_beam_search_decoder
# (it's slower but you'll get better results)
decoded, log_prob = tf.contrib.ctc.ctc_greedy_decoder(logits, seq_len)

# Inaccuracy: label error rate
ler = tf.reduce_mean(tf.edit_distance(tf.cast(decoded[0], tf.int32),
                                      targets))

Edit (09/27/16)

I realized that I have to change my optimizer in order to add the weight noise regularizer. However, I do not know how to plug it into my code.

    variables = tf.trainable_variables()

    with tf.variable_scope(self.name or "OptimizeLoss", [loss, global_step]):

        update_ops = set(ops.get_collection(ops.GraphKeys.UPDATE_OPS))

        # Make sure update ops are run before computing loss.
        if update_ops:
            loss = control_flow_ops.with_dependencies(list(update_ops), loss)

        add_noise_ops = [tf.no_op()]
        if self.weights_noise_scale is not None:
            add_noise_ops, remove_noise_ops = self._noise_ops(variables, self.weights_noise_scale)

            # Make sure noise is added to the weights before computing the loss.
            loss = control_flow_ops.with_dependencies(add_noise_ops, loss)


        # Compute gradients.
        gradients = self._opt.compute_gradients(loss, variables, colocate_gradients_with_ops=self.colocate_gradients_with_ops)

        # Optionally add gradient noise.
        if self.gradient_noise_scale is not None:
            gradients = self._add_scaled_noise_to_gradients(gradients, self.gradient_noise_scale)

        # Optionally clip gradients by global norm.
        if self.clip_gradients_by_global_norm is not None:
            gradients = self._clip_gradients_by_global_norm(gradients, self.clip_gradients_by_global_norm)

        # Optionally clip gradients by value.
        if self.clip_gradients_by_value is not None:
            gradients = self._clip_gradients_by_value(gradients, self.clip_gradients_by_value)

        # Optionally clip gradients by norm.
        if self.clip_gradients_by_norm is not None:
            gradients = self._clip_gradients_by_norm(gradients, self.clip_gradients_by_norm)

        self._grads = [g[0] for g in gradients]
        self._vars = [g[1] for g in gradients]

        # Create gradient updates.
        # Make sure that the weight noise is removed before the gradient update rule is applied
        grad_updates = self._opt.apply_gradients(gradients,
                                               global_step=global_step,
                                               name="train")


        # Ensure the train_tensor computes grad_updates.
        train_tensor = control_flow_ops.with_dependencies([grad_updates], loss)

Can anyone shed some light on this? Thanks :)

Answer 1

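To solve this, I would build 2 graphs: one for training and another for evaluation. The latter would not add noise to the weights. To add random noise to the weights, you can do this: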

W = tf.Variable(tf.truncated_normal([num_hidden,
                                     num_classes],
                                    stddev=0.1))
noise = tf.truncated_normal([num_hidden, num_classes],
                            stddev=0.001)
W = W + noise

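The tf.truncated_normal tensor adds a small amount of random noise to your weights; a fresh sample is drawn every time the graph is run.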
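
The train/eval split described above can be illustrated without TensorFlow. The following is a minimal plain-Python sketch, not the TensorFlow graph itself: `NoisyLinear` and its `training` flag are hypothetical names made up for this example, and plain Gaussian noise stands in for `tf.truncated_normal`.

```python
import random


class NoisyLinear:
    """Linear layer whose training path perturbs the weights on each call."""

    def __init__(self, weights, bias, noise_std, seed=0):
        self.weights = list(weights)  # the clean, stored weights
        self.bias = bias
        self.noise_std = noise_std
        self._rng = random.Random(seed)

    def forward(self, inputs, training):
        if training:
            # Training path: use a freshly perturbed copy of the weights.
            used = [w + self._rng.gauss(0.0, self.noise_std)
                    for w in self.weights]
        else:
            # Evaluation path: use the stored weights untouched.
            used = self.weights
        return sum(w * x for w, x in zip(used, inputs)) + self.bias


layer = NoisyLinear([0.5, -0.25], bias=0.1, noise_std=0.001)
x = [1.0, 2.0]
eval_out = layer.forward(x, training=False)   # deterministic
train_out = layer.forward(x, training=True)   # slightly perturbed
print(eval_out, train_out)
```

Both paths share the same stored weights, which is the role the shared variables would play between the training graph and the evaluation graph.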
