I would like to do the following:
import theano, numpy, theano.tensor as T
a = T.fvector('a')
w = theano.shared(numpy.array([1, 2, 3, 4], dtype=theano.config.floatX))
w_sub = w[1]
b = T.sum(a * w)
grad = T.grad(b, w_sub)
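Here, w_sub is for example w[1], but I do not want to write b explicitly as a function of w_sub. Despite going through this and other related questions, I have not been able to solve it.
This is only a minimal example to show my problem. What I actually want to do is a sparse convolution with Lasagne. The zero entries in the weight matrix do not need to be updated, so there is no need to compute the gradient for those entries of w.
Here is the full error message: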
Traceback (most recent call last):
  File "D:/Jeroen/Project_Lasagne_General/test_script.py", line 9, in <module>
    grad = T.grad(b, w_sub)
  File "C:\Anaconda2\lib\site-packages\theano\gradient.py", line 545, in grad
    handle_disconnected(elem)
  File "C:\Anaconda2\lib\site-packages\theano\gradient.py", line 532, in handle_disconnected
    raise DisconnectedInputError(message)
theano.gradient.DisconnectedInputError: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: Subtensor{int64}.0
Backtrace when the node is created:
  File "D:/Jeroen/Project_Lasagne_General/test_script.py", line 6, in <module>
    w_sub = w[1]
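Answer 0 (score: 2)
When Theano compiles a graph, it only sees variables as they are explicitly defined in that graph. In your example, w_sub is not explicitly used in the computation of b, so it is not part of b's computational graph.
With the following code, which uses Theano's visualization library (theano.d3viz), you can see in the rendered graph that w_sub is indeed not part of the graph of b.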
import theano
import theano.tensor as T
import numpy
import theano.d3viz as d3v
a = T.fvector('a')
w = theano.shared(numpy.array([1, 2, 3, 4], dtype=theano.config.floatX))
w_sub = w[1]
b = T.sum(a * w)
o = b, w_sub
d3v.d3viz(o, 'b.html')
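The same point can be illustrated without Theano. The following is a minimal toy sketch, not Theano's actual implementation: the `Node` class and the `op` and `ancestors` helpers are hypothetical names made up for this illustration. It assumes only that indexing a variable creates a new node that depends on the original, which is why a cost built from w never reaches w_sub when you walk back through its inputs.

```python
# Toy computational graph (an illustration only, NOT how Theano is written):
# every node remembers the nodes it was built from.
class Node:
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = tuple(inputs)

    def __getitem__(self, index):
        # Indexing creates a NEW node whose only input is self.
        return Node("%s[%d]" % (self.name, index), inputs=(self,))


def op(name, *inputs):
    """Hypothetical helper: build a node that depends on the given inputs."""
    return Node(name, inputs)


def ancestors(node):
    """Collect every node reachable by following inputs backwards."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in current.inputs:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


a = Node("a")
w = Node("w")
w_sub = w[1]  # a new node that hangs off w

b_full = op("sum(a * w)", a, w)          # built from w, as in the question
b_sub = op("sum(a * w_sub)", a, w_sub)   # built from w_sub instead

in_full = w_sub in ancestors(b_full)
in_sub = w_sub in ancestors(b_sub)
print(in_full, in_sub)  # prints: False True
```

Walking back from `b_full` only ever reaches `a` and `w`; `w_sub` is a separate node derived from `w`, so nothing in the cost points at it.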
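To fix this, you need to use w_sub explicitly in the computation of b.
You will then be able to compute the gradient of b with respect to w_sub and update the value of the shared variable, as in the following example: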
import theano
import theano.tensor as T
import numpy
a = T.fvector('a')
w = theano.shared(numpy.array([1, 2, 3, 4], dtype=theano.config.floatX))
w_sub = w[1]
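# Build the cost from w_sub itself so that w_sub is part of b's graph.
b = T.sum(a * w_sub)
grad = T.grad(b, w_sub)
# Write the updated element back into the shared variable w.
updates = [(w, T.inc_subtensor(w_sub, -0.1*grad))]
f = theano.function([a], b, updates=updates, allow_input_downcast=True)
# arange(10) is int64; allow_input_downcast lets it be cast to float32.
f(numpy.arange(10))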
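For the sparse-weights case mentioned in the question, a different workaround is to keep the cost defined on the full w, take the gradient with respect to w, and mask the result before applying the update. This is not the method from the answer above. The sketch below shows the arithmetic in plain NumPy, assuming the cost from the question, b = sum(a * w), whose gradient with respect to w is simply a. The names `grad_wrt_w` and `masked_sgd_step` are made up for this illustration, and the 0.1 step size mirrors the one used in the answer.

```python
import numpy


def grad_wrt_w(a, w):
    """Hand-written gradient of b = sum(a * w) w.r.t. w: db/dw[i] = a[i]."""
    return a.copy()


def masked_sgd_step(w, grad, mask, learning_rate=0.1):
    """One gradient-descent step; entries where mask == 0 are left unchanged."""
    return w - learning_rate * grad * mask


a = numpy.array([1.0, 2.0, 3.0, 4.0])
w = numpy.array([1.0, 0.0, 3.0, 0.0])   # zeros stand for the pruned weights
mask = (w != 0).astype(w.dtype)          # 1.0 where a weight may be updated

new_w = masked_sgd_step(w, grad_wrt_w(a, w), mask)
print(new_w)
```

In Theano terms this would roughly correspond to computing `T.grad(b, w)` and multiplying it by a constant mask before building the update. Note that this keeps the zero entries at zero but still computes the full gradient, so it does not save the computation the question hoped to avoid.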