Question

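I have a strange error that I cannot understand when compiling a scan operator in Theano. When outputs_info is initialized with a last dimension equal to one, I get this error: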

TypeError: ('The following error happened while compiling the node', forall_inplace,cpu,
scan_fn}(TensorConstant{4}, IncSubtensor{InplaceSet;:int64:}.0, <TensorType(float32, vector)>), 
'\n', "Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are 
associated with the same recurrent state and should have the same type but have type 
'TensorType(float32, (True,))' and 'TensorType(float32, vector)' respectively.")

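If this dimension is set to anything greater than one, I get no error.

This error happens on both gpu and cpu targets, with Theano 0.7, 0.8.0 and 0.8.2.

Here is a piece of code that reproduces the error: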

import theano
import theano.tensor as T
import numpy as np

def rec_fun( prev_output, bias):        
    return prev_output + bias

n_steps = 4

# with state_size>1, compilation runs smoothly
state_size = 2

bias = theano.shared(np.ones((state_size),dtype=theano.config.floatX))
(outputs, updates) = theano.scan( fn=rec_fun,
                              sequences=[],
                              outputs_info=T.zeros([state_size,]),
                              non_sequences=[bias],
                              n_steps=n_steps
                              )
print outputs.eval()

# with state_size==1, compilation fails
state_size = 1

bias = theano.shared(np.ones((state_size),dtype=theano.config.floatX))
(outputs, updates) = theano.scan( fn=rec_fun,
                              sequences=[],
                              outputs_info=T.zeros([state_size,]),
                              non_sequences=[bias],
                              n_steps=n_steps
                              )
# compilation fails here
print outputs.eval()

Depending on "state_size", the compilation therefore behaves differently. Is there a workaround that handles both the state_size == 1 and state_size > 1 cases?

Answer 1

Change

outputs_info=T.zeros([state_size,])

to

outputs_info=T.zeros_like(bias)

so that it also works for the state_size == 1 case.

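Brief explanation and an alternative solution

I noticed an important difference between the two cases. In both cases, add these lines of code right after the bias declaration line.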

bias = ....
print bias.broadcastable
print T.zeros([state_size,]).broadcastable

Results

For the first case:

(False,)
(False,)

For the second case, the one that fails:

(False,)
(True,)

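What happens is that when you add two tensors of the same dimension (bias and T.zeros) that have different broadcastable patterns, the result inherits the pattern from bias, because a dimension of the sum is flagged broadcastable only if it is broadcastable in both inputs. The step function therefore returns a TensorType(float32, vector) while the initial state is a TensorType(float32, (True,)), and Theano reports that the two are not the same type.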
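To make the mismatch easier to see, here is a minimal sketch in plain Python that only mimics the broadcastable bookkeeping described above. It is a toy model, not Theano code: the function names (zeros_pattern, shared_pattern, add_pattern, recurrent_state_is_consistent) are hypothetical, and the "type" of a tensor is reduced to its broadcastable pattern.

```python
# Toy model of the broadcastable patterns discussed above (NOT Theano code).
# Every function name here is hypothetical and exists only for illustration.


def zeros_pattern(shape):
    """Pattern of a tensor allocated with a constant shape: a dimension
    known to be exactly 1 is flagged as broadcastable."""
    return tuple(dim == 1 for dim in shape)


def shared_pattern(shape):
    """Pattern of a shared variable created with default settings: no
    dimension is flagged as broadcastable, whatever its runtime size."""
    return tuple(False for _ in shape)


def add_pattern(a, b):
    """Pattern of an elementwise sum: a dimension stays broadcastable only
    if it is broadcastable in both inputs."""
    return tuple(x and y for x, y in zip(a, b))


def recurrent_state_is_consistent(initial, bias):
    """The scan check from the error message: the step output must have
    the same pattern as the state that is fed back in."""
    return add_pattern(initial, bias) == initial


for state_size in (2, 1):
    initial = zeros_pattern((state_size,))
    bias = shared_pattern((state_size,))
    print(state_size, initial, add_pattern(initial, bias),
          recurrent_state_is_consistent(initial, bias))
```

With state_size = 2 both patterns are (False,) and the check passes. With state_size = 1 the initial state is (True,) while the sum is (False,), which is the inconsistency reported in the error.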

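T.zeros_like works because it uses the bias variable to generate the zeros tensor, so the initial state gets the same broadcastable pattern as bias.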

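Another way to solve the problem is to change the broadcastable pattern explicitly

outputs_info=T.patternbroadcast(T.zeros([state_size,]), (False,)),

when the last dimension of the output is equal to one.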
