Question

When I try to run my LSTM program (for variable-length inputs), I get the following error.

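TypeError: Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are associated with the same recurrent state and should have the same type but have type 'TensorType(float64, col)' and 'TensorType(float64, matrix)' respectively.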

My program is based on the LSTM example for the IMDB sentiment analysis problem found at http://deeplearning.net/tutorial/lstm.html. My data is not IMDB; it is sensor data.

I have shared my source code, lstm_var_length.py, and my data, data.npz. (Click on the files.)

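From the error above and some Google searching, I understand that there is a problem with the vector/matrix dimensions in my functions. Here are the function definitions where the problem occurs: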

def lstm_layer(shared_params, input_ex, options):
    """
    LSTM Layer implementation. (Variable Length inputs)

    Parameters
    ----------
    shared_params: shared model parameters W, U, b etc
    input_ex: input example (say dimension: 36 x 100 i.e 36 features and 100 time units)
    options: Neural Network model options

    Output / returns
    ----------------
    output of each lstm cell [h_0, h_1, ..... , h_t]
    """

    def slice(param, slice_no, height):
        return param[slice_no*height : (slice_no+1)*height, :]

    def cell(wxb, ht_1, ct_1):
        pre_activation = tensor.dot(shared_params['U'], ht_1)
        pre_activation += wxb

        height = options['hidden_dim']
        ft = tensor.nnet.sigmoid(slice(pre_activation, 0, height))
        it = tensor.nnet.sigmoid(slice(pre_activation, 1, height))
        c_t = tensor.tanh(slice(pre_activation, 2, height))
        ot = tensor.nnet.sigmoid(slice(pre_activation, 3, height))

        ct = ft * ct_1 + it * c_t
        ht = ot * tensor.tanh(ct)

        return ht, ct

    wxb = tensor.dot(shared_params['W'], input_ex) + shared_params['b']
    num_frames = input_ex.shape[1]
    result, updates = theano.scan(cell,
                                  sequences=[wxb.transpose()],
                                  outputs_info=[tensor.alloc(numpy.asarray(0., dtype=floatX),
                                                             options['hidden_dim'], 1),
                                                tensor.alloc(numpy.asarray(0., dtype=floatX),
                                                             options['hidden_dim'], 1)],
                                  n_steps=num_frames)

    return result[0]  # only ht is needed


def build_model(shared_params, options):
    """
    Build the complete neural network model and return the symbolic variables

    Parameters
    ----------
    shared_params: shared, model parameters W, U, b etc
    options: Neural Network model options

    return
    ------
    x, y, f_pred_prob, f_pred, cost
    """

    x = tensor.matrix(name='x', dtype=floatX)
    y = tensor.iscalar(name='y') # tensor.vector(name='y', dtype=floatX)

    num_frames = x.shape[1]
    # lstm outputs from each cell
    lstm_result = lstm_layer(shared_params, x, options)
    # mean pool from the lstm cell outputs
    pool_result = lstm_result.sum(axis=1)/(1. * num_frames)
    # Softmax / Logistic Regression
    pred = tensor.nnet.softmax(tensor.dot(shared_params['softmax_W'], pool_result) +
                               shared_params['softmax_b'])
    # predicted probability function
    theano.printing.debugprint(pred)
    f_pred_prob = theano.function([x], pred, name='f_pred_prob', mode='DebugMode') # 'DebugMode' <-- Problem seems to occur at this point
    # predicted class
    f_pred = theano.function([x], pred.argmax(axis=0), name='f_pred')
    # cost of the model: -ve log likelihood
    offset = 1e-8   # an offset to prevent log(0)
    cost = -tensor.log(pred[y-1, 0] + offset)    # y = 1,2,...n but indexing is 0,1,..(n-1)

    return x, y, f_pred_prob, f_pred, cost

The error above is raised while trying to compile the f_pred_prob Theano function.

The exception and call stack are as follows:

File "/home/inblueswithu/Documents/Theano_Trails/lstm_var_length.py", line 450, in <module>
  main()
File "/home/inblueswithu/Documents/Theano_Trails/lstm_var_length.py", line 447, in main
  train_lstm(model_options, train, valid)
File "/home/inblueswithu/Documents/Theano_Trails/lstm_var_length.py", line 314, in train_lstm
  (x, y, f_pred_prob, f_pred, cost) = build_model(shared_params, options)
File "/home/inblueswithu/Documents/Theano_Trails/lstm_var_length.py", line 95, in build_model
  f_pred_prob = theano.function([x], pred, name='f_pred_prob', mode='DebugMode') # 'DebugMode'
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 320, in function
  output_keys=output_keys)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 479, in pfunc
  output_keys=output_keys)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1777, in orig_function
  defaults)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/debugmode.py", line 2571, in create
  storage_map=storage_map)
File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 690, in make_thunk
  storage_map=storage_map)[:3]
File "/usr/local/lib/python2.7/dist-packages/theano/compile/debugmode.py", line 1809, in make_all
  no_recycling)
File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 730, in make_thunk
  self.validate_inner_graph()
File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 249, in validate_inner_graph
  (self.name, type_input, type_output))
TypeError: Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are associated with the same recurrent state and should have the same type but have type 'TensorType(float64, col)' and 'TensorType(float64, matrix)' respectively.

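I have been debugging for a whole week but cannot find the problem. I suspect that the initialization of outputs_info in theano.scan is the problem, but when I remove the second dimension (1), I get an error in the slice function (near lstm_result) even before reaching the f_pred_prob function. I am not sure where the problem is.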

The problem can be reproduced simply by placing the data file in the same directory as the Python source file and running the program.

Please help me.

Thanks & Regards, inblueswithu

Answer 1

Use

outputs_info=[tensor.unbroadcast(tensor.alloc(numpy.asarray(0., dtype=floatX),
                                              options['hidden_dim'], 1),1),
              tensor.unbroadcast(tensor.alloc(numpy.asarray(0., dtype=floatX),
                                              options['hidden_dim'], 1),1)]

instead of the original outputs_info.

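This is because the second dim of tensor.alloc(numpy.asarray(0., dtype=floatX), options['hidden_dim'], 1) is 1, so Theano automatically makes that dimension broadcastable and wraps the tensor variable as a col instead of a matrix. That is the 'TensorType(float64, col)' in the error message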

TypeError: Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are associated with the same recurrent state and should have the same type but have type 'TensorType(float64, col)' and 'TensorType(float64, matrix)' respectively.

and tensor.unbroadcast, as used above, avoids this problem.
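To see where the matrix side of the mismatch may come from, here is a rough NumPy analogue of one step of cell. It is not Theano code; hidden_dim = 3 and the shape of U are assumptions made for illustration, since the question does not state them. The only point is the broadcasting: a (hidden_dim, 1) column plus the 1-D slice that scan passes in for each time step gives a full 2-D array, not a column.

```python
import numpy as np

hidden_dim = 3                                # hypothetical size, for illustration only
U = np.ones((4 * hidden_dim, hidden_dim))     # assumed shape of shared_params['U']
h_prev = np.zeros((hidden_dim, 1))            # same shape as alloc(..., hidden_dim, 1)
wxb_t = np.ones(4 * hidden_dim)               # one time step: a 1-D slice of wxb.transpose()

pre_activation = U.dot(h_prev)                # shape (4*hidden_dim, 1): still a column
# Written without += because NumPy refuses the in-place version of this broadcast.
pre_activation = pre_activation + wxb_t       # column + 1-D vector broadcasts to 2-D

h_next = pre_activation[:hidden_dim, :]       # what slice(pre_activation, 0, height) selects
print(h_prev.shape, pre_activation.shape, h_next.shape)
```

This prints (3, 1) (12, 12) (3, 12). If the same thing happens in the Theano graph, the state returned by cell is no longer a column, so it may also be worth checking the shapes inside cell and not only the declared type of outputs_info.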

Answer 2

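I think I found the problem. I had to recheck the dimensions of all the matrices. I still need to double-check my code. Once I am done, I will post the new code.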

Thanks.
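The corrected code was not posted in this thread. As a hypothetical illustration of what such a dimension check could look like, the sketch below runs the same gate arithmetic in plain NumPy with 1-D state vectors and asserts that the state keeps its shape at every step. The helper name lstm_step, the sizes, and the random parameters are all assumptions, not the poster's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(wxb_t, h_prev, c_prev, U, hidden_dim):
    """One LSTM step on 1-D state vectors (hypothetical helper, NumPy only)."""
    pre = U.dot(h_prev) + wxb_t                     # shape (4*hidden_dim,)
    f = sigmoid(pre[0 * hidden_dim:1 * hidden_dim])  # forget gate
    i = sigmoid(pre[1 * hidden_dim:2 * hidden_dim])  # input gate
    c_tilde = np.tanh(pre[2 * hidden_dim:3 * hidden_dim])
    o = sigmoid(pre[3 * hidden_dim:4 * hidden_dim])  # output gate
    c = f * c_prev + i * c_tilde
    h = o * np.tanh(c)
    return h, c

# Illustrative sizes only: 36 features as in the docstring, 5 time steps.
hidden_dim, n_features, n_frames = 3, 36, 5
rng = np.random.RandomState(0)
W = rng.randn(4 * hidden_dim, n_features)
U = rng.randn(4 * hidden_dim, hidden_dim)
b = np.zeros((4 * hidden_dim, 1))
x = rng.randn(n_features, n_frames)

wxb = W.dot(x) + b                                  # shape (4*hidden_dim, n_frames)
h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for wxb_t in wxb.T:                                 # one 1-D slice per time step
    h_new, c_new = lstm_step(wxb_t, h, c, U, hidden_dim)
    # The recurrent state must come back with the shape it went in with.
    assert h_new.shape == h.shape and c_new.shape == c.shape
    h, c = h_new, c_new

print(h.shape, c.shape)
```

Keeping the state 1-D means the per-step slice and the state have matching ranks, so nothing is broadcast into a larger array. Whether this matches the poster's eventual fix is not known.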
