Question

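I want to compute the gradient tensors for the weight variable and the bias term separately. The gradient for the weight variable is computed correctly, but the gradient for the bias is not. Please tell me what the problem is, or show me how to fix my code.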

import numpy as np
import tensorflow as tf

X =tf.constant([[1.0,0.1,-1.0],[2.0,0.2,-2.0],[3.0,0.3,-3.0],[4.0,0.4,-4.0],[5.0,0.5,-5.0]])
b1 = tf.Variable(-0.5)
Bb = tf.constant([ [1.0], [1.0], [1.0], [1.0], [1.0] ]) 
Bb = b1* Bb

Y0 = tf.constant([ [-10.0], [-5.0], [0.0], [5.0], [10.0] ])

W = tf.Variable([ [1.0], [1.0], [1.0] ])

with tf.GradientTape() as tape: 
    Y = tf.matmul(X, W) + Bb
    print("Y : ", Y.numpy())

    loss_val = tf.reduce_sum(tf.square(Y - Y0))  
    print("loss : ", loss_val.numpy())

gw = tape.gradient(loss_val, W)   # gradient calculation works well 
gb = tape.gradient(loss_val, b1)  # does NOT work

print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())

Answer 1

Two things. First, if you look at the documentation here:

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape#args

you will see that you can call gradient only once on a tape unless it was created with persistent=True.

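Second, you are setting Bb = b1* Bb outside the tape's context manager, so this operation is not recorded and the tape has no path from loss_val back to b1.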

import numpy as np
import tensorflow as tf

X =tf.constant([[1.0,0.1,-1.0],[2.0,0.2,-2.0],[3.0,0.3,-3.0],[4.0,0.4,-4.0],[5.0,0.5,-5.0]])
b1 = tf.Variable(-0.5)
Bb = tf.constant([ [1.0], [1.0], [1.0], [1.0], [1.0] ]) 


Y0 = tf.constant([ [-10.0], [-5.0], [0.0], [5.0], [10.0] ])

W = tf.Variable([ [1.0], [1.0], [1.0] ])

with tf.GradientTape(persistent=True) as tape: 
    Bb = b1* Bb
    Y = tf.matmul(X, W) + Bb
    print("Y : ", Y.numpy())

    loss_val = tf.reduce_sum(tf.square(Y - Y0))  
    print("loss : ", loss_val.numpy())

gw = tape.gradient(loss_val, W)   # first gradient call
gb = tape.gradient(loss_val, b1)  # second call is allowed because the tape is persistent; b1 * Bb was recorded inside the tape

print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())
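
As an alternative to a persistent tape, both gradients can be requested in a single gradient call by passing a list of sources. The sketch below is not part of the original answer; it assumes TensorFlow 2.x in eager mode, adds b1 directly (relying on broadcasting instead of the Bb column of ones), and the names residual, gw_expected and gb_expected are introduced here only for a hand-derived sanity check. For loss = sum((XW + b - Y0)^2), the gradients are 2 * X^T (Y - Y0) for W and 2 * sum(Y - Y0) for b.

```python
import numpy as np
import tensorflow as tf

X = tf.constant([[1.0, 0.1, -1.0],
                 [2.0, 0.2, -2.0],
                 [3.0, 0.3, -3.0],
                 [4.0, 0.4, -4.0],
                 [5.0, 0.5, -5.0]])
Y0 = tf.constant([[-10.0], [-5.0], [0.0], [5.0], [10.0]])

W = tf.Variable([[1.0], [1.0], [1.0]])
b1 = tf.Variable(-0.5)

# Every operation that involves b1 happens inside the tape, so it is recorded.
with tf.GradientTape() as tape:
    Y = tf.matmul(X, W) + b1  # the scalar bias broadcasts over the 5 rows
    loss_val = tf.reduce_sum(tf.square(Y - Y0))

# One call with a list of sources, so persistent=True is not needed.
gw, gb = tape.gradient(loss_val, [W, b1])

# Hand-derived gradients for comparison (assumed check, not from the answer).
residual = (Y - Y0).numpy()
gw_expected = 2.0 * X.numpy().T @ residual
gb_expected = 2.0 * residual.sum()

print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())
print("matches hand-derived W gradient:", np.allclose(gw.numpy(), gw_expected, atol=1e-2))
print("matches hand-derived b gradient:", np.isclose(gb.numpy(), gb_expected, atol=1e-2))
```

With the data above, the bias gradient should come out near -2.0 and the weight gradient near [-104, -10.4, 104], up to float32 rounding.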

Gradient computation for the bias term using GradientTape()
