Question

I have a computational graph built with Theano. It looks like this:

import theano
from theano import tensor as T
import numpy as np

W1 = theano.shared( np.random.rand(45,32).astype('float32'), 'W1')
b1 = theano.shared( np.random.rand(32).astype('float32'), 'b1')
W2 = theano.shared( np.random.rand(32,3).astype('float32'), 'W2')
b2 = theano.shared( np.random.rand(3).astype('float32'), 'b2')

input  = T.matrix('input')
hidden = T.tanh(T.dot(input, W1)+b1)
output = T.nnet.softmax(T.dot(hidden, W2)+b2)

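This is a mapping from a vector to a vector. However, the input is declared as a matrix so that I can pass many vectors through the mapping at once. I am doing some machine learning, and this makes the training phase more efficient.

The problem is that after the training phase I would like to treat the mapping as vector-to-vector, so that I can compute: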

jac = theano.gradient.jacobian(output, wrt=input)

jacobian complains that the input is not a TensorType(float32, vector). Is there a way to change the input tensor type without rebuilding the whole computational graph?

Answer 1

Technically, this is a possible solution:

import theano
from theano import tensor as T
import numpy as np

W1 = theano.shared( np.random.rand(45,32).astype('float32'), 'W1')
b1 = theano.shared( np.random.rand(32).astype('float32'), 'b1')
W2 = theano.shared( np.random.rand(32,3).astype('float32'), 'W2')
b2 = theano.shared( np.random.rand(3).astype('float32'), 'b2')

input  = T.vector('input') # it will be reshaped!
hidden = T.tanh(T.dot(input.reshape((-1, 45)), W1)+b1)
output = T.nnet.softmax(T.dot(hidden, W2)+b2)

#Here comes the trick
jac = theano.gradient.jacobian(output.reshape((-1,)), wrt=input).reshape((-1, 45, 3))

With this, jac.eval({input: np.random.rand(10*45)}).shape evaluates to (100, 45, 3)! (The underlying Jacobian is 30 x 450, and 30*450 = 100*45*3.)
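
The shape arithmetic can be checked without Theano. The following is only a NumPy sketch with a zero-filled stand-in for the Jacobian (the names batch, n_in and n_out are mine, not from the question). It also shows an alternative 4-D layout, which is my own suggestion rather than part of the answer, where each axis keeps a clear meaning:

```python
import numpy as np

batch, n_in, n_out = 10, 45, 3

# The flattened output has batch*n_out entries and the flattened input
# has batch*n_in entries, so the Jacobian is a (30, 450) matrix.
jac = np.zeros((batch * n_out, batch * n_in), dtype='float32')

# Reshape used in the answer: 30*450 / (45*3) = 100 leading entries.
reshaped = jac.reshape((-1, n_in, n_out))
print(reshaped.shape)  # (100, 45, 3)

# Assumed alternative layout:
# (output row, output class, input row, input feature)
blocks = jac.reshape((batch, n_out, batch, n_in))
print(blocks.shape)  # (10, 3, 10, 45)
```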

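The problem is that it also computes derivatives across the batch index. So in theory the first 1x45 input row could influence all 10x3 outputs (for a batch of length 10), and the Jacobian contains an entry for every such pair.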
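
To illustrate the cross-batch entries, here is a rough NumPy sketch that rebuilds the same network shape and estimates the full Jacobian by central finite differences. The helpers forward and numeric_jacobian, the small random weights, and the batch size of 4 are assumptions for illustration only. Because each row passes through the network independently, the blocks that pair different rows should come out as (numerically) zero:

```python
import numpy as np

rng = np.random.RandomState(0)
batch, n_in, n_hid, n_out = 4, 45, 32, 3

# Small random weights so that tanh does not saturate (illustrative values).
W1 = 0.1 * rng.randn(n_in, n_hid)
b1 = 0.1 * rng.randn(n_hid)
W2 = 0.1 * rng.randn(n_hid, n_out)
b2 = 0.1 * rng.randn(n_out)

def forward(flat_x):
    """Flat input -> flat softmax output, mirroring the reshaped graph."""
    x = flat_x.reshape((-1, n_in))
    hidden = np.tanh(x.dot(W1) + b1)
    logits = hidden.dot(W2) + b2
    logits = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(logits)
    return (e / e.sum(axis=1, keepdims=True)).ravel()

def numeric_jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian, one input coordinate at a time."""
    jac = np.empty((f(x).size, x.size))
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        jac[:, k] = (f(x + step) - f(x - step)) / (2 * eps)
    return jac

x = rng.randn(batch * n_in)
jac = numeric_jacobian(forward, x)               # (batch*n_out, batch*n_in)
blocks = jac.reshape((batch, n_out, batch, n_in))

# Largest entry in blocks pairing a row with itself vs. with another row.
same = max(np.abs(blocks[b, :, b, :]).max() for b in range(batch))
cross = max(np.abs(blocks[b, :, c, :]).max()
            for b in range(batch) for c in range(batch) if b != c)
print(same, cross)
```

So the full Jacobian is mostly wasted work: only the batch "diagonal" blocks carry information.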

There are several ways around this. You could take the diagonal over the first two axes, but unfortunately Theano does not implement it, whereas NumPy does!
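
On the NumPy side, the diagonal extraction could look roughly like this. It is a sketch that assumes the evaluated Jacobian has been reshaped to (batch, n_out, batch, n_in), which is my layout and not the one in the code above, so the diagonal is taken over axes 0 and 2. The random array full merely stands in for an evaluated Jacobian:

```python
import numpy as np

batch, n_in, n_out = 10, 45, 3
rng = np.random.RandomState(0)

# Stand-in for an evaluated (30, 450) Jacobian.
full = rng.rand(batch * n_out, batch * n_in)
blocks = full.reshape((batch, n_out, batch, n_in))

# np.diagonal removes the two chosen axes and appends the diagonal axis
# at the end, giving (n_out, n_in, batch); move it back to the front.
per_sample = np.moveaxis(np.diagonal(blocks, axis1=0, axis2=2), -1, 0)

# The same selection written with einsum (repeated index b = diagonal).
per_sample_einsum = np.einsum('bibj->bij', blocks)

print(per_sample.shape)  # (10, 3, 45)
```

Here per_sample[b] is the 3x45 Jacobian of output row b with respect to input row b.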

I think this could be done with scan, but that is another matter.
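
I have not written the scan version. As a stand-in, this NumPy sketch shows the row-by-row idea that a scan would follow: loop over the batch and compute one 3x45 Jacobian per input row. Note that it uses a hand-derived Jacobian for this particular tanh/softmax network instead of Theano's automatic differentiation, and that forward_row, row_jacobian and the weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.RandomState(0)
batch, n_in, n_hid, n_out = 10, 45, 32, 3
W1 = 0.1 * rng.randn(n_in, n_hid)
b1 = 0.1 * rng.randn(n_hid)
W2 = 0.1 * rng.randn(n_hid, n_out)
b2 = 0.1 * rng.randn(n_out)

def forward_row(x):
    """One input vector -> (softmax output, hidden activations)."""
    h = np.tanh(x.dot(W1) + b1)
    z = h.dot(W2) + b2
    e = np.exp(z - z.max())
    return e / e.sum(), h

def row_jacobian(x):
    """Chain rule: d softmax/dz . dz/dh . dh/da . da/dx, shape (n_out, n_in)."""
    y, h = forward_row(x)
    dsoftmax = np.diag(y) - np.outer(y, y)        # (n_out, n_out)
    # W2.T * (1 - h**2) scales column m by the tanh derivative at unit m.
    return dsoftmax.dot(W2.T * (1.0 - h ** 2)).dot(W1.T)

X = rng.randn(batch, n_in)
# One Jacobian per row; no cross-batch blocks are ever computed.
jac = np.stack([row_jacobian(x) for x in X])
print(jac.shape)  # (10, 3, 45)
```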
