Question

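I am developing a very simple program that uses theano.scan to loop over an array of vectors (eventually I plan to build an LSTM layer on top of it). The problem is that whenever I compile the function I get the error "Non-unit value on shape on a broadcastable dimension". I believe the updates parameter is the cause, because the error appears as soon as I pass it when compiling the function. Here is the code: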

import theano
import theano.tensor as T
from utils import *
import numpy as np

class LSTM:
    def __init__(self, X, in_size, out_size):
        self.X = X
        self.in_size = in_size
        self.out_size = out_size
        self.W_x = init_weights((in_size, out_size), "W_x")

        def _active(x, pre_h):
            x = T.reshape(x, (1, in_size))
            pre_h = T.dot(x, self.W_x)
            return pre_h

        h, updates = theano.scan(_active, sequences=X,
            outputs_info = [T.alloc(floatX(0.), 1, out_size)])

        self.activation = h

if __name__ == "__main__":
    X = T.matrix('X')
    in_size = 2
    out_size = 4
    lstm = LSTM(X, in_size, out_size)
    value = lstm.activation
    cost = T.mean(value)
    params = [lstm.W_x]

    updates = []
    for p in params:
        gp = T.grad(cost, p)
        updates.append((p, p - 0.1*gp))

    f = theano.function([X], outputs = cost, updates=updates)

    test = f(np.random.rand(10, in_size))
    print test

The code uses a few helper functions loaded from utils.py, shown below:

#pylint: skip-file
import numpy as np
import theano
import theano.tensor as T

def floatX(X):
    return np.asarray(X, dtype=theano.config.floatX)

def init_weights(shape, name):
    return theano.shared(floatX(np.random.randn(*shape) * 0.1), name)

def init_gradws(shape, name):
    return theano.shared(floatX(np.zeros(shape)), name)

def init_bias(size, name):
    return theano.shared(floatX(np.zeros((size,))), name)

I have been searching but cannot find a solution, and I cannot see anything wrong with my code. If I do not use theano.scan, the code runs fine.
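For reference, here is a minimal plain-NumPy sketch of what the scan loop above is meant to compute, without theano.scan. It is only an illustration of the intended shapes and of one gradient step, not a fix for the error. The names scan_reference and sgd_step are made up for this sketch and are not part of the original program; the gradient formula is derived by hand for cost = mean(h).

```python
import numpy as np

def scan_reference(X, W_x):
    """Hypothetical helper: plain-NumPy version of the loop theano.scan should run."""
    in_size = W_x.shape[0]
    steps = []
    for x in X:                       # one row of X per step, like sequences=X
        x = x.reshape(1, in_size)     # same reshape as in _active
        steps.append(x.dot(W_x))      # each step has shape (1, out_size)
    return np.stack(steps)            # stacked shape: (n_steps, 1, out_size)

def sgd_step(X, W_x, lr=0.1):
    """Hypothetical helper: cost and one gradient-descent update, done by hand."""
    h = scan_reference(X, W_x)
    cost = h.mean()
    # cost = sum_t sum_j sum_i X[t, i] * W_x[i, j] / (n_steps * out_size),
    # so d(cost)/d(W_x[i, j]) = sum_t X[t, i] / (n_steps * out_size).
    col_sums = X.sum(axis=0)[:, None]                    # shape (in_size, 1)
    grad = np.repeat(col_sums, W_x.shape[1], axis=1) / h.size
    return cost, W_x - lr * grad

if __name__ == "__main__":
    in_size, out_size = 2, 4
    X = np.random.rand(10, in_size)
    W_x = np.random.randn(in_size, out_size) * 0.1
    cost, new_W = sgd_step(X, W_x)
    print(scan_reference(X, W_x).shape)   # (10, 1, 4)
    print(cost)
```

Each step returns an array of shape (1, out_size), which is the same shape as the initial state passed through outputs_info, so the stacked result has a length-1 middle axis.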

Can anyone see a problem in my code? Do you have any suggestions for fixing it?

Thanks in advance.

theano.scan: Non-unit value on shape on a broadcastable dimension

0 answers: