TensorFlow gradient returns None

Date: 2018-11-26 00:06:15

Tags: python tensorflow deep-learning

I am trying to implement the example given in the Tensorflow tutorial on custom training. For some reason, dW and db are None, and I don't understand why t.gradient() returns None.

import tensorflow as tf
tf.enable_eager_execution()

class Model(object):
    def __init__(self):
        self.W = tf.Variable(5.0)
        self.b = tf.Variable(0.0)
    def __call__(self,x):
        return self.W*x+self.b
    def loss_function(self, y_true, y_predicted):
        return tf.reduce_mean(tf.square(y_predicted-y_true))
    def train(self, inputs, outputs, learning_rate):
        with tf.GradientTape() as t:
            current_loss = self.loss_function(inputs,outputs)
        dW,db = t.gradient(current_loss,[self.W, self.b])
        ## dW and db returns None
        self.W.assign_sub(learning_rate*dW)
        self.b.assign_sub(learning_rate*db)

However, the following code works fine when train is not a method of the model. Is there a reason for this?

import tensorflow as tf
tf.enable_eager_execution()

class Model(object):
    def __init__(self):
        self.W = tf.Variable(5.0)
        self.b = tf.Variable(0.0)
    def __call__(self,x):
        return self.W*x+self.b
    def loss_function(self, y_true, y_predicted):
        return tf.reduce_mean(tf.square(y_predicted-y_true))

def train(model, inputs, outputs, learning_rate):
    with tf.GradientTape() as t:
        current_loss = model.loss_function(model(inputs),outputs)
    dW,db = t.gradient(current_loss,[model.W, model.b])
    ## dW and db returns None
    model.W.assign_sub(learning_rate*dW)
    model.b.assign_sub(learning_rate*db)
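Note that the two snippets differ in more than where train lives: the working function computes model(inputs) inside the tape, while the method version never calls the model at all. For comparison, here is a sketch of train kept as a model method but with the forward pass added (written for TF 2.x, where eager execution is on by default, so tf.enable_eager_execution() is not needed):

```python
import tensorflow as tf  # assumes TF 2.x (eager execution enabled by default)

class Model(object):
    def __init__(self):
        self.W = tf.Variable(5.0)
        self.b = tf.Variable(0.0)

    def __call__(self, x):
        return self.W * x + self.b

    def loss_function(self, y_true, y_predicted):
        return tf.reduce_mean(tf.square(y_predicted - y_true))

    def train(self, inputs, outputs, learning_rate):
        with tf.GradientTape() as t:
            # Call the model inside the tape so current_loss
            # actually depends on self.W and self.b.
            current_loss = self.loss_function(self(inputs), outputs)
        dW, db = t.gradient(current_loss, [self.W, self.b])
        self.W.assign_sub(learning_rate * dW)
        self.b.assign_sub(learning_rate * db)
```

With the forward pass recorded by the tape, dW and db are real tensors and the variable updates work, whether or not train is a method.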

1 Answer:

Answer 0 (score: 0)

For gradient to work, the entire graph needs to be constructed within the scope of the GradientTape.

For example, in the code provided in the Tensorflow tutorial on custom training:

with tf.GradientTape() as t:
    current_loss = model.loss_function(model(inputs),outputs)

the graph connecting current_loss to the model variables (model.W and model.b) is constructed within the scope of t.

If you change the code provided in the tutorial as follows:

logits = model(inputs)
with tf.GradientTape() as t:
    current_loss = model.loss_function(logits, outputs)

you will get None for both dW and db, because the operations connecting logits to the variables are executed outside the tape and are therefore not recorded. Note that your first snippet fails for a related reason: current_loss = self.loss_function(inputs, outputs) never calls the model, so current_loss does not depend on self.W or self.b at all, and t.gradient() returns None regardless of whether train is a method.
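A minimal check of both cases (a sketch, assuming TF 2.x with eager execution on by default): a forward pass inside the tape yields real gradients, while the same forward pass moved outside the tape yields None.

```python
import tensorflow as tf  # assumes TF 2.x

W = tf.Variable(5.0)
b = tf.Variable(0.0)
x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([2.0, 4.0, 6.0])

def loss_fn(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_pred - y_true))

# Case 1: forward pass inside the tape -- the ops linking
# the loss to W and b are recorded, so gradients exist.
with tf.GradientTape() as t:
    logits = W * x + b
    loss = loss_fn(y, logits)
dW1, db1 = t.gradient(loss, [W, b])
print(dW1 is None, db1 is None)  # False False

# Case 2: forward pass outside the tape -- logits is just a
# constant tensor by the time the tape starts recording, so
# there is no recorded path from loss back to W and b.
logits = W * x + b
with tf.GradientTape() as t:
    loss = loss_fn(y, logits)
dW2, db2 = t.gradient(loss, [W, b])
print(dW2 is None, db2 is None)  # True True
```

The same reasoning explains the original question: if the model is never called inside the tape, the loss has no recorded dependence on the variables, and the gradients come back as None.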