I have a function that takes two TensorFlow vectors and a scalar threshold and returns a TensorFlow op. The following version throws a "ValueError: No gradients provided for any variable".
def mse(expected, probs, threshold):
    preds = tf.to_double(probs >= threshold)
    loss_vect = tf.square(expected - preds)
    loss = -tf.reduce_mean(loss_vect)
    return loss
However, if I remove the first line, leaving the following version of the function, no error is raised.
def mse(expected, probs, threshold):
    loss_vect = tf.square(expected - probs)
    loss = -tf.reduce_mean(loss_vect)
    return loss
The context in which I call the function is as follows. The function above is passed in as loss_func. For act_func, I pass in a function that returns a tf.sigmoid op (a sketch of such a function follows the class below).
class OneLayerNet(object):
    def __init__(self, num_feats, num_outputs, act_func, threshold, loss_func, optimizer, batch_size=8, epochs=100, eta=0.01, reg_const=0):
        self.batch_size = batch_size
        self.epochs = epochs
        self.eta = eta
        self.reg_const = reg_const
        self.x = tf.sparse_placeholder(tf.float64, name="placeholderx")  # num_sents x num_feats
        self.y = tf.placeholder(tf.float64, name="placeholdery")  # 1 x num_sents
        self.w = tf.get_variable("W", shape=[num_feats, num_outputs], initializer=tf.contrib.layers.xavier_initializer(), dtype=tf.float64)
        self.b = tf.Variable(tf.zeros([num_outputs], dtype=tf.float64))
        self.probs = act_func(self.x, self.w, self.b)
        self.loss = loss_func(self.y, self.probs, threshold)
        self.optimizer = optimizer(self.eta, self.loss)
        self.session = tf.Session()
        self.session.run(tf.global_variables_initializer())
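For reference, here is a minimal sketch of the kind of act_func described above (hypothetical; the original implementation is not shown). It uses a sparse-dense matmul because self.x is a tf.sparse_placeholder:

def sigmoid_act(x, w, b):
    # x is a SparseTensor, so use the sparse-dense matmul before
    # applying the sigmoid activation.
    logits = tf.sparse_tensor_dense_matmul(x, w) + b
    return tf.sigmoid(logits)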
From other answers, I understand that the ValueError I'm getting means that the path between my weight vector w and my optimizer is broken. What I'd like to know is why that path breaks when the tf.to_double call is added.
Answer 0 (score: 1)
The problem does not come from to_double, but from the fact that you are thresholding probs. When you compute probs >= threshold, the result is binary. Computing the gradient of this expression with respect to probs does not make much sense, because it is 0 almost everywhere, except at the threshold itself, where it is infinite. Converting the result to double unfortunately does not change that: since no usable gradient can flow from the loss back through preds to probs, and from there to w and b, the optimizer reports that no gradients are provided for any variable.
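A common fix, shown here as a minimal sketch rather than as part of the original answer: keep the loss a function of the continuous probabilities so gradients can flow, and apply the threshold only where no gradient is needed, e.g. when computing hard predictions for evaluation.

import tensorflow as tf  # TF 1.x API, matching the question

def mse(expected, probs, threshold):
    # The loss depends on the continuous probabilities, so gradients
    # flow back through probs to w and b. Note the positive sign:
    # optimizers minimize the loss, so the MSE should not be negated.
    return tf.reduce_mean(tf.square(expected - probs))

def hard_predictions(probs, threshold):
    # Thresholding is fine here because this op is only evaluated
    # (e.g. for accuracy), never differentiated.
    return tf.to_double(probs >= threshold)

You can also confirm the broken path in the original version with tf.gradients(loss, [w]), which returns [None] when every route from loss back to w goes through a non-differentiable op such as the comparison.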