Question

I'm having some trouble using Theano's scan function with the following code:

def lstm_layer(tparams, options, trng, prefix='lstm'):

    def _slice(_x, n, dim):
        if _x.ndim == 3:
            return _x[:, :, n * dim:(n + 1) * dim]
        return _x[:, n * dim:(n + 1) * dim]

    def _step(sample_, h_, c_):
        theano.printing.debugprint(sample_, print_type=True)
        emb = tparams['Wemb'][sample_]
        x_ = tensor.dot(emb[None, :], tparams[_p(prefix, 'W')]) + tparams[_p(prefix, 'b')]
        preact = tensor.dot(h_, tparams[_p(prefix, 'U')])
        preact += x_

        i = tensor.nnet.sigmoid(_slice(preact, 0, options['dim_proj']))
        f = tensor.nnet.sigmoid(_slice(preact, 1, options['dim_proj']))
        o = tensor.nnet.sigmoid(_slice(preact, 2, options['dim_proj']))
        c = tensor.tanh(_slice(preact, 3, options['dim_proj']))

        c = f * c_ + i * c
        h = o * tensor.tanh(c)

        pred = tensor.nnet.softmax(tensor.dot(h, tparams['U']) + tparams['b'])
        rand = trng.multinomial(n=1, pvals=pred)
        sample = tensor.argmax(rand[0], axis=0)
        return sample, h, c

    start = tensor.scalar('start', dtype='int64')
    dim_proj = options['dim_proj']
    nsteps = options['seq_length']
    rval, updates = theano.scan(_step,
                                outputs_info=[start,
                                              tensor.alloc(numpy_floatX(0.),
                                                           1,
                                                           dim_proj),
                                              tensor.alloc(numpy_floatX(0.),
                                                           1,
                                                           dim_proj)],
                                name=_p(prefix, '_layers'),
                                n_steps=2)
    return rval[0], start

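As you can see, the variable start is an integer that gets a new value after each call to _step, and I want to get the sequence of its values after an arbitrary number of steps n_steps. If I run the code with n_steps = 1, everything works. However, for n_steps > 1, I get this error:

TypeError: Cannot convert Type TensorType(float64, 3D) (of Variable IncSubtensor{Set;:int64:}.0) into Type TensorType(float64, (False, True, False)). You can try to manually convert IncSubtensor{Set;:int64:}.0 into a TensorType(float64, (False, True, False)).

I can't figure out where it comes from, since none of my variables is a 3D tensor (I checked with theano.printing.debugprint: h and c are rows and sample is a scalar, as expected).

Do you have any clue?

Thanks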

Answer 1

I actually found a way to solve the problem. I changed this

    def _slice(_x, n, dim):
        if _x.ndim == 3:
            return _x[:, :, n * dim:(n + 1) * dim]
        return _x[:, n * dim:(n + 1) * dim]

to this

    def _slice(_x, n, dim):
        if _x.ndim == 3:
            return _x[:, :, n * dim:(n + 1) * dim]
        if _x.ndim == 2:
            return _x[:, n * dim:(n + 1) * dim]
        return _x[n * dim:(n + 1) * dim]
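
To see what the extra branch buys, here is a minimal NumPy sketch of the same slicing logic (NumPy stands in for Theano here, and `slice_gate` is a hypothetical name for illustration). It shows that the three-branch version handles a 1-D vector as well as a 2-D row, always cutting the n-th block of width `dim` from the last axis.

```python
import numpy as np

def slice_gate(x, n, dim):
    """Return the n-th block of width `dim` along the last axis of `x`."""
    if x.ndim == 3:
        return x[:, :, n * dim:(n + 1) * dim]
    if x.ndim == 2:
        return x[:, n * dim:(n + 1) * dim]
    # 1-D case: without this branch, x[:, ...] would fail on a vector
    return x[n * dim:(n + 1) * dim]

dim = 3
preact_vec = np.arange(4 * dim, dtype=float)  # vector, shape (12,)
preact_row = preact_vec[None, :]              # row, shape (1, 12)

gate_vec = slice_gate(preact_vec, 1, dim)     # second gate block of the vector
gate_row = slice_gate(preact_row, 1, dim)     # second gate block of the row
print(gate_vec.shape, gate_row.shape)
```

The vector case keeps a 1-D result while the row case keeps its leading axis of length 1, so the gates inherit whichever layout the preactivation has.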

and this

    x_ = tensor.dot(emb[None, :], tparams[_p(prefix, 'W')]) + tparams[_p(prefix, 'b')]

to this

    x_ = tensor.dot(emb, tparams[_p(prefix, 'W')]) + tparams[_p(prefix, 'b')]

This makes x_, h_ and c_ Theano vectors rather than rows as before, and removes the error (though I'm not sure why).
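
The row-versus-vector difference can be illustrated with plain NumPy shapes. This is only a sketch: the arrays `Wemb`, `W` and `b` below are made-up stand-ins, and NumPy has no notion of Theano's broadcastable pattern. In Theano, a row is a 2-D type whose first dimension is flagged broadcastable, and the `(False, True, False)` pattern in the error message is what such a row looks like once a leading time axis is added. A plausible reading, offered as an assumption rather than a confirmed diagnosis, is that the buffer built from the `(1, dim_proj)` initial states and the values produced by the step disagree on that flag, which cannot happen once everything is a plain vector.

```python
import numpy as np

dim_proj = 4
rng = np.random.RandomState(0)
Wemb = rng.randn(10, dim_proj)         # hypothetical embedding table
W = rng.randn(dim_proj, 4 * dim_proj)  # hypothetical input-to-gates weights
b = np.zeros(4 * dim_proj)             # hypothetical bias

emb = Wemb[3]                          # one embedding, shape (dim_proj,)

x_row = np.dot(emb[None, :], W) + b    # original code: row, shape (1, 4 * dim_proj)
x_vec = np.dot(emb, W) + b             # revised code: vector, shape (4 * dim_proj,)

print(x_row.shape, x_vec.shape)
same_values = np.allclose(x_row[0], x_vec)  # only the layout differs
```

The values are identical in both cases; the change only drops the leading axis of length 1.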

Of course, I also updated the call to scan:

    rval, updates = theano.scan(_step,
                                outputs_info=[start,
                                              tensor.alloc(numpy_floatX(0.),
                                                           dim_proj),
                                              tensor.alloc(numpy_floatX(0.),
                                                           dim_proj)],
                                name=_p(prefix, '_layers'),
                                n_steps=2)
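
For readers who want to check the intended recurrence outside Theano, the loop that scan is asked to run can be sketched in NumPy. This is an illustrative re-implementation under assumptions, not the original code: `sample_sequence`, the `params` keys and the random weights are all hypothetical, and `rng.choice` replaces the `multinomial` plus `argmax` pair. Each iteration feeds the previous sample, `h` and `c` back in, which is the role `outputs_info` plays in the scan call.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def sample_sequence(params, start, n_steps, rng):
    """Run the LSTM sampling step n_steps times, feeding outputs back in."""
    dim = params['U_lstm'].shape[0]
    sample, h, c = start, np.zeros(dim), np.zeros(dim)  # vector initial states
    samples = []
    for _ in range(n_steps):
        emb = params['Wemb'][sample]
        preact = (np.dot(h, params['U_lstm'])
                  + np.dot(emb, params['W']) + params['b_lstm'])
        i = sigmoid(preact[0 * dim:1 * dim])         # input gate
        f = sigmoid(preact[1 * dim:2 * dim])         # forget gate
        o = sigmoid(preact[2 * dim:3 * dim])         # output gate
        c_new = np.tanh(preact[3 * dim:4 * dim])     # candidate cell state
        c = f * c + i * c_new
        h = o * np.tanh(c)
        pred = softmax(np.dot(h, params['U']) + params['b'])
        sample = int(rng.choice(len(pred), p=pred))  # draw the next index
        samples.append(sample)
    return np.array(samples)

rng = np.random.RandomState(0)
vocab, dim = 5, 4
params = {
    'Wemb': rng.randn(vocab, dim),
    'W': rng.randn(dim, 4 * dim),
    'U_lstm': rng.randn(dim, 4 * dim),
    'b_lstm': np.zeros(4 * dim),
    'U': rng.randn(dim, vocab),
    'b': np.zeros(vocab),
}
seq = sample_sequence(params, start=0, n_steps=6, rng=rng)
print(seq.shape)
```

The returned array holds one sampled index per step, which corresponds to `rval[0]` in the scan version.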
