If an assign op is applied to a weight tensor after the weight tensor has been used in the forward pass of a network, does TensorFlow's backpropagation take the assign op into account when determining the gradient for that weight? For example, if I have:
weights = tf.Variable(...)
bias = tf.Variable(...)
output = tf.tanh(tf.matmul(weights, input) + bias)
weight_assign_op = weights.assign(weights + 1.0)
with tf.control_dependencies([weight_assign_op]):
    output2 = tf.identity(output)
The output is computed, and then a change is made to the weights. If the output is then used to compute a loss and gradients to update the variables, will the gradients be created taking the change to weights into account? That is, will the gradient for weights be the correct gradient for old_weights + 1.0, or will it still be the gradient for old_weights, which, when applied to the new weights, won't necessarily be the "correct" gradient for gradient descent?
Answer 0 (score: 1)
I ended up testing it experimentally. The gradient calculation does take the assign op into account. I used the code below to test it. Running it produces a positive gradient. Commenting out the weight assign op line and the control dependency lines results in a negative gradient. That is because the gradient is considering either the original starting value of the weights, 0.0, or the updated weights, 2.0, after the assignment: with weights of 0.0 the outputs undershoot the labels, giving a negative gradient, while with weights of 2.0 the outputs overshoot the labels, giving a positive gradient.
import tensorflow as tf

data = [[1.0], [2.0], [3.0]]
labels = [[1.0], [2.1], [2.9]]

input_data = tf.placeholder(dtype=tf.float32, shape=[3, 1])
input_labels = tf.placeholder(dtype=tf.float32, shape=[3, 1])

# Start the weights at 0.0 so the sign of the gradient reveals which
# value the backward pass actually saw.
weights = tf.Variable(tf.constant([0.0]))
bias = tf.Variable(tf.constant([0.0]))
output = (weights * input_data) + bias

# Assign new weights after the forward pass is defined, and force the
# assignment to run before anything that consumes `output`.
weight_assign_op = weights.assign(tf.constant([2.0]))
with tf.control_dependencies([weight_assign_op]):
    output = tf.identity(output)

loss = tf.reduce_sum(tf.norm(output - input_labels))
weight_gradient = tf.gradients(loss, weights)

initialize_op = tf.global_variables_initializer()
session = tf.Session()
session.run([initialize_op])
weight_gradient_value = session.run(
    [weight_gradient], feed_dict={input_data: data, input_labels: labels})
print(weight_gradient_value)
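As a sanity check that is not part of the original answer, the two candidate gradients can be computed in closed form: for loss = norm(w * x + b - y), the derivative with respect to w is dot(w * x + b - y, x) / norm(w * x + b - y). Below is a minimal NumPy sketch evaluating that expression at both weight values; the analytic_gradient helper name is just for illustration.

import numpy as np

def analytic_gradient(w, data, labels, bias=0.0):
    # d/dw of the Euclidean norm of the residual (w * data + bias - labels).
    residual = w * data + bias - labels
    return np.dot(residual, data) / np.linalg.norm(residual)

data = np.array([1.0, 2.0, 3.0])
labels = np.array([1.0, 2.1, 2.9])

# Updated weights of 2.0: outputs [2.0, 4.0, 6.0] overshoot the labels.
print(analytic_gradient(2.0, data, labels))  # ~ +3.74

# Original weights of 0.0: outputs [0.0, 0.0, 0.0] undershoot the labels.
print(analytic_gradient(0.0, data, labels))  # ~ -3.74

If the positive gradient reported above corresponds to the updated weights, it should come out near +3.74, which is how the sign of the result identifies whether the assign op was taken into account.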