stop_gradient in TensorFlow

Time: 2018-05-07 20:15:37

Tags: tensorflow tensorflow-gradient

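I would like to know whether tf.stop_gradient stops the gradient computation for a given op, or whether it stops its input tf.Variable from being updated. My problem is as follows: during the forward pass on MNIST, I want to apply a set of operations to the weights (say, W to W*) and then do a matmul with the input. However, I want to exclude these operations from the backward pass. I only want dE/dW to be computed by backpropagation during training. The code I wrote prevents W from being updated. Can you help me understand why? If these were variables, I understand I should set their trainable attribute to False, but these are operations on the weights. If stop_gradient cannot be used for this purpose, how can I build two graphs, one for the forward pass and one for backpropagation?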

import math

import tensorflow as tf

def build_layer(inputs, fmap, nscope, layer_size1, layer_size2, faulty_training):
  with tf.name_scope(nscope): 
    if (faulty_training):
      ## trainable weight
      weights_i = tf.Variable(tf.truncated_normal([layer_size1, layer_size2],stddev=1.0 / math.sqrt(float(layer_size1))),name='weights_i')
      ## Operations on weight whose gradient should not be computed during backpropagation
      weights_fx_t = tf.multiply(268435456.0,weights_i)
      weight_fx_t = tf.stop_gradient(weights_fx_t)
      weights_fx = tf.cast(weights_fx_t,tf.int32)
      weight_fx = tf.stop_gradient(weights_fx)
      weights_fx_fault = tf.bitwise.bitwise_xor(weights_fx,fmap)
      weight_fx_fault = tf.stop_gradient(weights_fx_fault)
      weights_fl = tf.cast(weights_fx_fault, tf.float32)
      weight_fl = tf.stop_gradient(weights_fl)
      weights = tf.stop_gradient(tf.multiply((1.0/268435456.0),weights_fl))
      ##### end transformation
    else:
      weights = tf.Variable(tf.truncated_normal([layer_size1, layer_size2],stddev=1.0 / math.sqrt(float(layer_size1))),name='weights')


    biases = tf.Variable(tf.zeros([layer_size2]), name='biases')
    hidden = tf.nn.relu(tf.matmul(inputs, weights) + biases)
    return weights,hidden

I am using the TensorFlow gradient descent optimizer for training.

optimizer = tf.train.GradientDescentOptimizer(learning_rate) 
global_step = tf.Variable(0, name='global_step', trainable=False) 
train_op = optimizer.minimize(loss, global_step=global_step)

1 answer:

Answer 0 (score: 1)

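tf.stop_gradient stops backpropagation from continuing past that node in the graph. Your code has no path from weights_i to the loss other than the one that goes through weights_fx_t, where the gradient is stopped. That is why weights_i is not updated during training. You do not need to put a stop_gradient after every step; using it once is enough to stop backpropagation at that point.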
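A minimal sketch of this behavior, assuming the TensorFlow 1.x graph API is reachable through `tensorflow.compat.v1`. The names `w`, `x`, `blocked`, `loss_blocked` and `loss_open` are illustrative and do not come from the question's code. When the only path from a variable to the loss passes through tf.stop_gradient, tf.gradients reports no gradient for that variable; without it, the gradient flows as usual.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: run in TF1-style graph mode

w = tf.Variable([1.0, 2.0], name="w")
x = tf.constant([3.0, 4.0])

# Case 1: the only route from w to the loss goes through stop_gradient.
blocked = tf.stop_gradient(2.0 * w)
loss_blocked = tf.reduce_sum(blocked * x)

# Case 2: the same computation without stop_gradient.
loss_open = tf.reduce_sum(2.0 * w * x)

# tf.gradients returns None for an input with no differentiable path.
grad_blocked = tf.gradients(loss_blocked, [w])[0]
grad_open = tf.gradients(loss_open, [w])[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    grad_open_value = sess.run(grad_open)

print("gradient through stop_gradient:", grad_blocked)
print("gradient without stop_gradient:", grad_open_value)
```

Because an optimizer only updates variables for which it receives a gradient, a variable in the position of `w` in case 1 stays at its initial value.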

If tf.stop_gradient does not do what you want, you can obtain the gradients yourself (for example with tf.gradients) and then write your own update op (for example with tf.assign). This lets you alter the gradients however you need.
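One way this suggestion could look, as a hedged sketch rather than the answerer's exact code: take the gradient of the loss with respect to the transformed weights W*, then apply it by hand to the underlying variable W. Everything here is an assumption for illustration: the rounding step is a stand-in for the question's fixed-point and XOR transformation, and `w`, `w_star`, `learning_rate` and `train_w` are hypothetical names. `tf.assign_sub` is used as a shorthand for an assign-based update.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: run in TF1-style graph mode

learning_rate = 0.01  # hypothetical value

w = tf.Variable([[1.1], [1.9]], name="w")  # trainable weights W
x = tf.constant([[3.0, 4.0]])              # one input row

# Stand-in for the forward-only transformation W -> W*.
# stop_gradient keeps it out of the backward pass.
w_star = tf.stop_gradient(tf.round(w * 4.0) / 4.0)

# The forward pass uses W*, not W.
loss = tf.reduce_sum(tf.square(tf.matmul(x, w_star)))

# Gradient of the loss with respect to W* itself.
grad_w_star = tf.gradients(loss, [w_star])[0]

# Hand-written update: apply the W* gradient directly to W.
train_w = tf.assign_sub(w, learning_rate * grad_w_star)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    new_w = sess.run(train_w)  # value of W after one manual step

print(new_w)
```

In this sketch the transformation never appears in the backward pass, yet W still moves, because the update op is built by hand instead of relying on `optimizer.minimize` to find a differentiable path to W.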