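I want to compute the gradient tensors for the weight variable and the bias term separately. The gradient for the weights comes out correctly, but the gradient for the bias does not. Please tell me what the problem is, or show me how to correct my code.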
import numpy as np
import tensorflow as tf
X = tf.constant([[1.0,0.1,-1.0],[2.0,0.2,-2.0],[3.0,0.3,-3.0],[4.0,0.4,-4.0],[5.0,0.5,-5.0]])
b1 = tf.Variable(-0.5)
Bb = tf.constant([ [1.0], [1.0], [1.0], [1.0], [1.0] ])
Bb = b1* Bb
Y0 = tf.constant([ [-10.0], [-5.0], [0.0], [5.0], [10.0] ])
W = tf.Variable([ [1.0], [1.0], [1.0] ])
with tf.GradientTape() as tape:
    Y = tf.matmul(X, W) + Bb
    print("Y : ", Y.numpy())
    loss_val = tf.reduce_sum(tf.square(Y - Y0))
    print("loss : ", loss_val.numpy())
gw = tape.gradient(loss_val, W) # gradient calculation works well
gb = tape.gradient(loss_val, b1) # does NOT work
print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())
Answer 0 (score: 1)
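Two things. First, the documentation at
https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape#args
says that you can call gradient only once on a tape unless you create it with persistent=True.
Second, you compute Bb = b1 * Bb outside the tape's context manager, so that operation is never recorded and the tape has no path from loss_val back to b1.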
import numpy as np
import tensorflow as tf
X = tf.constant([[1.0,0.1,-1.0],[2.0,0.2,-2.0],[3.0,0.3,-3.0],[4.0,0.4,-4.0],[5.0,0.5,-5.0]])
b1 = tf.Variable(-0.5)
Bb = tf.constant([ [1.0], [1.0], [1.0], [1.0], [1.0] ])
Y0 = tf.constant([ [-10.0], [-5.0], [0.0], [5.0], [10.0] ])
W = tf.Variable([ [1.0], [1.0], [1.0] ])
with tf.GradientTape(persistent=True) as tape:
    Bb = b1 * Bb
    Y = tf.matmul(X, W) + Bb
    print("Y : ", Y.numpy())
    loss_val = tf.reduce_sum(tf.square(Y - Y0))
    print("loss : ", loss_val.numpy())
gw = tape.gradient(loss_val, W) # gradient calculation works well
gb = tape.gradient(loss_val, b1) # works now: the multiplication by b1 is recorded on the tape
print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())
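If you would rather not keep a persistent tape around, a variant that should also work is to request both gradients in a single gradient call by passing a list of sources. This is only a sketch of that idea, not the only way to do it: it reuses the data from the question, and it adds the scalar b1 directly and relies on broadcasting instead of building the Bb column, which is my own simplification.

```python
import tensorflow as tf

X = tf.constant([[1.0, 0.1, -1.0],
                 [2.0, 0.2, -2.0],
                 [3.0, 0.3, -3.0],
                 [4.0, 0.4, -4.0],
                 [5.0, 0.5, -5.0]])
Y0 = tf.constant([[-10.0], [-5.0], [0.0], [5.0], [10.0]])
W = tf.Variable([[1.0], [1.0], [1.0]])
b1 = tf.Variable(-0.5)

# Non-persistent tape: only one gradient() call is allowed afterwards.
with tf.GradientTape() as tape:
    # b1 is used inside the tape, so the tape records its effect on Y.
    # The scalar is broadcast over the (5, 1) result of the matmul
    # (assumption: this replaces the explicit Bb = b1 * ones column).
    Y = tf.matmul(X, W) + b1
    loss_val = tf.reduce_sum(tf.square(Y - Y0))

# Passing a list of sources returns the gradients in the same order.
gw, gb = tape.gradient(loss_val, [W, b1])

print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())
```

As a sanity check by hand: the bias gradient is 2 * sum(Y - Y0), which for this data is 2 * (9.6 + 4.7 - 0.2 - 5.1 - 10.0) = -2.0, so gb should print a value close to that.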