Question

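I am a beginner with Theano, and I am working from an example in someone else's code, which presumably worked at some point. I have modified it, but I am fairly sure my modifications have nothing to do with the problem I am seeing now.

In any case, I am trying to debug a Theano scan, and I think what I am observing is a basic error in how I call the scan function.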

U, V, W = self.U, self.V, self.W
x = T.ivector('x')
y = T.ivector('y')
def forward_prop_step(x_t, s_t_prev, U, V, W):
    s_t = T.tanh(U.dot(x_t) + V.dot(s_t_prev))
    o_t = T.tanh(W.dot(s_t))
    return [o_t,s_t]
[o,s], updates = theano.scan(
        forward_prop_step,
        sequences=x,
        outputs_info=[None, dict(initial=T.zeros(self.hidden_dim))],
        non_sequences=[U, V, W],
        truncate_gradient=self.bptt_truncate,
        strict=True)

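U is an m x n matrix, V is an n x n matrix, W is an n x o matrix, and self.bptt_truncate is a scalar (4). But I don't think the interior of my function is what is failing at the moment.

The error I get is:

ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (`outputs_info` in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 1) has 2 dimension(s), while the result of the inner function (`fn`) has 2 dimension(s) (should be one less than the initial state).

I tried changing the dimensions of outputs_info and the dimensions of what fn returns, but nothing has worked so far.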

I am currently going through the documentation, and from what I can see there, what I am doing appears to be correct. Below is the example from the documentation:

W = T.matrix()
W_in_1 = T.matrix()
W_in_2 = T.matrix()
W_feedback = T.matrix()
W_out = T.matrix()

u = T.matrix() # it is a sequence of vectors
x0 = T.matrix() # initial state of x has to be a matrix, since
                # it has to cover x[-3]
y0 = T.vector() # y0 is just a vector since scan has only to provide
                # y[-1]

([x_vals, y_vals], updates) = theano.scan(fn=oneStep,
        sequences=dict(input=u, taps=[-4,-0]),
        outputs_info=[dict(initial=x0, taps=[-3,-1]), y0],
        non_sequences=[W, W_in_1, W_in_2, W_feedback, W_out],
        strict=True)
# for second input y, scan adds -1 in output_taps by default

This is oneStep from the documentation:

def oneStep(u_tm4, u_t, x_tm3, x_tm1, y_tm1, W, W_in_1, W_in_2,  W_feedback, W_out):

    x_t = T.tanh(theano.dot(x_tm1, W) + \
                 theano.dot(u_t,   W_in_1) + \
                 theano.dot(u_tm4, W_in_2) + \
                 theano.dot(y_tm1, W_feedback))
    y_t = theano.dot(x_tm3, W_out)

    return [x_t, y_t]

There, the function returns [x_t, y_t], and outputs_info is [dict(initial=x0, taps=[-3,-1]), y0].

In my implementation, the function returns [o_t, s_t], and outputs_info is [None, dict(initial=T.zeros(self.hidden_dim))]. That makes sense to me, because I have no reason to pass the output o_t back into the function.

Answer 1

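I ran into exactly the same problem when applying an RNN to an NLP task. The error is caused by the type of the x_t argument of the forward_prop_step function: because scan iterates over the ivector x, x_t is a scalar. U.dot(x_t) is then a matrix rather than a vector, so s_t ends up with one dimension too many.

The solution is to use a vector instead. For example, x_tv below is a vector that is all zeros except for a 1 at index x_t: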

def forward_prop_step(x_t, s_t_prev, U, V, W):
    x_tv = T.eye(1, m=input_size, k=x_t)[0]
    s_t = T.tanh(U.dot(x_tv) + V.dot(s_t_prev))
    o_t = T.tanh(W.dot(s_t))
    return [o_t, s_t]
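
The dimension count can be checked outside Theano. The following is a minimal NumPy sketch, not Theano code; hidden_dim, input_size and the values in U are hypothetical, and U is assumed to have shape (hidden_dim, input_size). It shows that multiplying by a scalar index keeps two dimensions, while multiplying by the one-hot vector gives one dimension and simply selects column x_t of U.

```python
import numpy as np

hidden_dim, input_size = 3, 5   # hypothetical sizes, for illustration only
U = np.arange(hidden_dim * input_size, dtype=float).reshape(hidden_dim, input_size)

x_t = 2                         # one element of the index sequence x: a scalar

# Scalar case: the "dot" just scales the matrix, so the result stays 2-D.
scalar_result = U.dot(x_t)
print(scalar_result.ndim)       # 2

# One-hot case: same construction as in the answer, done with NumPy's eye.
x_tv = np.eye(1, input_size, k=x_t)[0]
vector_result = U.dot(x_tv)
print(vector_result.ndim)       # 1

# The one-hot product is the same as picking column x_t of U.
print(np.array_equal(vector_result, U[:, x_t]))  # True
```

Since the product with a one-hot vector equals a column of U, indexing that column directly (U[:, x_t]) should be an equivalent alternative under the same shape assumption, and it avoids building the one-hot vector at every step.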

Answer 2

Try the following? Note the difference between (self.hidden_dim, ) and (self.hidden_dim):

outputs_info=[None, dict(initial=T.zeros((self.hidden_dim, )))],
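
For reference, the two spellings differ at the Python level: parentheses alone only group an expression, while the trailing comma creates a one-element tuple. Below is a small plain-Python sketch, with hidden_dim as a hypothetical stand-in for self.hidden_dim. It only shows the Python-level difference; whether T.zeros treats the two arguments differently in a given Theano version is not something it demonstrates.

```python
hidden_dim = 4  # hypothetical stand-in for self.hidden_dim

as_int = (hidden_dim)      # parentheses only group: this is still an int
as_tuple = (hidden_dim, )  # the trailing comma makes a 1-element tuple

print(type(as_int).__name__)    # int
print(type(as_tuple).__name__)  # tuple
print(len(as_tuple))            # 1
```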

Error with Theano scan: outputs_info dim and fn output dim mismatch