Question

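I am trying to implement a minimal recurrent neural network example in Theano. I expect the following Python script to print a 10×20 matrix representing the sequence of hidden states.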

# import packages/functions
from theano import shared, scan, function, tensor as T
import numpy as np

# declare variables
X = T.dmatrix("X")
Wx = shared(np.random.uniform(-1.0, 1.0, (10, 20)))
Wh = shared(np.random.uniform(-1.0, 1.0, (20, 20)))
b = shared(np.random.uniform(-1.0, 1.0, (1, 20)))

# define recurrence function
def recurrence(x_t, h_tm1):
    return T.nnet.sigmoid(T.dot(h_tm1, Wh) + T.dot(x_t, Wx) + b)

# compute hidden state sequence with scan
ht, _ = scan(fn = recurrence, sequences = X,
             outputs_info = np.zeros((1, 20)))

# define function producing hidden state sequence
fn = function([X], ht)reshape((1,3))

# test function
print(fn(np.eye(10)))

Instead, it returns the error: TypeError: Cannot convert Type TensorType(float64, 3D) (of Variable IncSubtensor{Set;:int64:}.0) into Type TensorType(float64, (False, True, False)). You can try to manually convert IncSubtensor{Set;:int64:}.0 into a TensorType(float64, (False, True, False)).

This is particularly confusing because, as far as I can tell, none of my variables are 3-tensors!

Answer 1

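The question's code contains a syntax error: the reshape((1,3)) at the end of the fn = line is invalid and appears to have been added by mistake. The code runs once that element is deleted from the line. The rest of this answer assumes that this edit matches the question author's intent.

I could not reproduce the stated error. That may be because I am using the bleeding-edge version of Theano; the error message for this situation may well have changed in recent versions of the code. However, the syntax error in the code above suggests another possibility: the code in the question is not actually the code that produced the error pasted into the question.

Using the edited code and a recent version of Theano, I get the error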

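TypeError: ('The following error happened while compiling the node', forall_inplace,cpu,scan_fn}(Shape_i{0}.0, Subtensor{int64:int64:int8}.0, IncSubtensor{InplaceSet;:int64:}.0, , , ), '\n', "Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are associated with the same recurrent state and should have the same type but have type 'TensorType(float64, row)' and 'TensorType(float64, matrix)' respectively.")

This is similar to the question's error, but it refers to a mismatch between a matrix and a row vector; there is no mention of 3-tensors.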

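The simplest change that avoids this error is to change the shape of the b shared variable.

Instead of

b = shared(np.random.uniform(-1.0, 1.0, (1, 20)))

use

b = shared(np.random.uniform(-1.0, 1.0, (20,)))

I would also recommend doing the same for the initial value of h_tm1.

Instead of

outputs_info = np.zeros((1, 20))

use

outputs_info = np.zeros((20,))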
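
The effect of this shape change can be checked without Theano. The following is a plain NumPy sketch, not Theano code: run_rnn is a hypothetical stand-in for the scan loop, and the fixed random seed is an assumption made only for repeatability. It suggests why a (20,) state gives the expected 10×20 result, while a (1, 20) state stacks into a 3-dimensional array, which may be where the question's 3-tensor comes from.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_rnn(X, Wx, Wh, b, h0):
    """Hypothetical NumPy stand-in for scan: apply the recurrence to each row of X."""
    h = h0
    states = []
    for x_t in X:  # each x_t has shape (10,)
        h = sigmoid(np.dot(h, Wh) + np.dot(x_t, Wx) + b)
        states.append(h)
    # Stack the per-step hidden states along a new leading (time) axis.
    return np.stack(states)

rng = np.random.RandomState(0)  # seed chosen arbitrarily for this sketch
Wx = rng.uniform(-1.0, 1.0, (10, 20))
Wh = rng.uniform(-1.0, 1.0, (20, 20))
X = np.eye(10)

# Vector-shaped bias and initial state, as recommended in this answer.
vec_states = run_rnn(X, Wx, Wh, rng.uniform(-1.0, 1.0, (20,)), np.zeros((20,)))
print(vec_states.shape)  # (10, 20)

# Row-shaped (1, 20) bias and initial state, as in the question.
row_states = run_rnn(X, Wx, Wh, rng.uniform(-1.0, 1.0, (1, 20)), np.zeros((1, 20)))
print(row_states.shape)  # (10, 1, 20)
```

With vector-shaped values every hidden state is a plain length-20 vector, so the stacked sequence is a matrix with one row per time step.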

Answer 2

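The problem arises because Theano wraps outputs_info in a variable of an unsuitable type. For example, outputs_info = np.zeros((3, 20)) is wrapped as TensorType(float64, matrix). But when a dimension has length 1, Theano automatically marks that dimension as broadcastable, so outputs_info = np.zeros((1, 20)) becomes TensorType(float64, row), because its first dimension is broadcastable.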
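
The inference rule described in this answer can be mimicked in a few lines. This is only an illustrative sketch: inferred_broadcastable is a hypothetical helper, not part of Theano's API, and it assumes the rule is simply "a dimension of length 1 is marked broadcastable".

```python
import numpy as np

def inferred_broadcastable(value):
    """Hypothetical helper: mark each length-1 dimension as broadcastable."""
    return tuple(dim == 1 for dim in np.shape(value))

# A (3, 20) value has no length-1 dimension: the "matrix" pattern.
print(inferred_broadcastable(np.zeros((3, 20))))  # (False, False)

# A (1, 20) value has a broadcastable first dimension: the "row" pattern.
print(inferred_broadcastable(np.zeros((1, 20))))  # (True, False)
```

Under that assumption, the initial state is typed as a row while the value returned by the recurrence is typed as a matrix, which is the mismatch scan complains about.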

The solution is outputs_info = T.unbroadcast(T.zeros((1,20)), 0), which ensures that it is wrapped as a matrix.

# import packages/functions
from theano import shared, scan, function, config, tensor as T
import numpy as np

# declare variables
X = T.tensor3("X")
Wx = shared(np.asarray(np.random.uniform(-1.0, 1.0, (10, 20)), dtype=config.floatX))
Wh = shared(np.asarray(np.random.uniform(-1.0, 1.0, (20, 20)), dtype=config.floatX))
b = shared(np.asarray(np.random.uniform(-1.0, 1.0, (1, 20)), dtype=config.floatX))

# define recurrence function
def recurrence(x_t, h_tm1):
    print("%s %s" % (h_tm1.type, x_t.type))  # show the types scan passes in
    return T.nnet.sigmoid(T.dot(h_tm1, Wh) + T.dot(x_t, Wx) + b)

# compute hidden state sequence with scan
ht, _ = scan(fn = recurrence, sequences = X,
             outputs_info = T.unbroadcast(T.zeros((1,20)), 0))

# define function producing hidden state sequence
fn = function([X], ht)

# test function
print(fn(np.eye(10,10,dtype=config.floatX).reshape(10,1,10)))

Why does this minimal RNN code throw a type error for a type that is never used?
