TensorFlow: no gradients provided for any variable when using tf.to_double

Date: 2018-05-28 18:20:17

Tags: python tensorflow

I have a function that takes two TensorFlow vectors and a scalar threshold and returns a TensorFlow op. The following version throws "ValueError: No gradients provided for any variable".

def mse(expected, probs, threshold):
    preds = tf.to_double(probs >= threshold)
    loss_vect = tf.square(expected - preds)
    loss = -tf.reduce_mean(loss_vect)
    return loss

However, if I remove the first line, giving the following version of the function, no error is raised.

def mse(expected, probs, threshold):
    loss_vect = tf.square(expected - probs)
    loss = -tf.reduce_mean(loss_vect)
    return loss

The context in which I call the function is shown below. The function above is passed in as loss_func. For act_func, I pass in a function that returns a tf.sigmoid op.

class OneLayerNet(object):
    def __init__(self, num_feats, num_outputs, act_func, threshold, loss_func, optimizer, batch_size=8, epochs=100, eta=0.01, reg_const=0):
        self.batch_size = batch_size
        self.epochs = epochs
        self.eta = eta
        self.reg_const = reg_const

        self.x = tf.sparse_placeholder(tf.float64, name="placeholderx") # num_sents x num_feats
        self.y = tf.placeholder(tf.float64, name="placeholdery") # 1 x num_sents
        self.w = tf.get_variable("W", shape=[num_feats, num_outputs], initializer=tf.contrib.layers.xavier_initializer(), dtype=tf.float64)
        self.b = tf.Variable(tf.zeros([num_outputs], dtype=tf.float64))

        self.probs = act_func(self.x, self.w, self.b)
        self.loss = loss_func(self.y, self.probs, threshold)
        self.optimizer = optimizer(self.eta, self.loss)
        self.session = tf.Session()
        self.session.run(tf.global_variables_initializer())

From other answers I know that the ValueError I'm getting means the gradient path between my weight vector w and my optimizer is broken. What I'd like to understand is why the path breaks when I add the tf.to_double call.
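
A quick way to confirm the broken path (a minimal sketch using the TF 1.x API, with demo names that are not part of my original code) is to ask for the gradients directly; tf.gradients returns None for a variable that has no gradient path to the loss:

import tensorflow as tf

x = tf.placeholder(tf.float64, shape=[None, 3], name="x_demo")
y = tf.placeholder(tf.float64, shape=[None, 1], name="y_demo")
w = tf.get_variable("W_demo", shape=[3, 1], dtype=tf.float64)

probs = tf.sigmoid(tf.matmul(x, w))
preds = tf.to_double(probs >= 0.5)           # hard threshold, as in mse() above
loss = -tf.reduce_mean(tf.square(y - preds))

# The comparison op has no gradient, so nothing flows back to w.
print(tf.gradients(loss, [w]))               # -> [None]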

1 Answer:

Answer 0 (score: 1)

The problem does not come from to_double but from the fact that you are thresholding probs.

When you compute probs >= threshold, the result is binary. Computing the gradient of this expression with respect to probs does not make much sense: it is 0 almost everywhere, and undefined (infinite) exactly at the threshold.

Converting the result to double will unfortunately not change the situation with respect to that point.
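
If a differentiable stand-in for the hard threshold is acceptable, one common workaround (a sketch only, not part of this answer; the sharpness constant k is an arbitrary choice) is to replace the step with a steep sigmoid so gradients can flow:

import tensorflow as tf

def soft_mse(expected, probs, threshold, k=50.0):
    # A steep sigmoid approximates the indicator probs >= threshold
    # while keeping a nonzero gradient with respect to probs.
    soft_preds = tf.sigmoid(k * (probs - threshold))
    return tf.reduce_mean(tf.square(expected - soft_preds))

Alternatively, train on the raw probabilities (the second version of mse above) and apply the threshold only at evaluation time.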