In Theano

Date: 2015-10-15 15:05:51

Tags: python theano

I noticed that in Theano, when you create a shared variable from a 1D numpy array, it becomes a vector, but not a row:

import theano.tensor as T
import theano, numpy

shared_vector = theano.shared(numpy.zeros((10,)))
print(shared_vector.type)
# TensorType(float64, vector)
print(shared_vector.broadcastable)
# (False,)

The same holds for a 1 x N matrix: it becomes a matrix, but not a row:

shared_vector = theano.shared(numpy.zeros((1,10,)))
print(shared_vector.type)
# TensorType(float64, matrix)
print(shared_vector.broadcastable)
# (False, False)
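
Both results above are consistent with one rule: when `theano.shared` receives a NumPy array and no explicit pattern, every dimension is marked non-broadcastable, whatever the array's actual shape. The plain-Python sketch below models that rule together with the pattern-to-name mapping seen in the printed types. The helpers `default_pattern` and `type_name` are made up for this illustration and are not part of Theano's API.

```python
# Toy model in plain Python; no Theano required.
# default_pattern and type_name are hypothetical helpers, not Theano functions.

# Names Theano prints for common broadcastable patterns.
PATTERN_NAMES = {
    (): "scalar",
    (False,): "vector",
    (True, False): "row",
    (False, True): "col",
    (False, False): "matrix",
}


def default_pattern(shape):
    """Pattern assumed when no `broadcastable` argument is passed:
    every dimension is False, even a dimension of length 1."""
    return (False,) * len(shape)


def type_name(pattern):
    """Readable name for a pattern, falling back to the number of dimensions."""
    return PATTERN_NAMES.get(tuple(pattern), "%dD tensor" % len(pattern))


for shape in [(10,), (1, 10)]:
    pattern = default_pattern(shape)
    print(shape, pattern, type_name(pattern))
```

In this model a (1, 10) array can only become a row if the pattern (True, False) is supplied explicitly, since the length-1 dimension is never inspected.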

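This is troublesome when I want to add a 1 x N row vector to an M x N matrix, because the shared variable is not broadcastable in the first dimension. First, this does not work: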

row = T.row('row')
mat = T.matrix('matrix')
f = theano.function(
    [],
    mat + row,
    givens={
        mat: numpy.zeros((20,10), dtype=numpy.float32),
        row: numpy.zeros((10,), dtype=numpy.float32)
    },
    on_unused_input='ignore'
)

Error:

TypeError: Cannot convert Type TensorType(float32, vector) (of Variable <TensorType(float32, vector)>) into Type TensorType(float32, row). You can try to manually convert <TensorType(float32, vector)> into a TensorType(float32, row).

OK, so clearly we cannot assign a vector to a row. Unfortunately, this does not work either:

row = T.matrix('row')
mat = T.matrix('matrix')
f = theano.function(
    [],
    mat + row,
    givens={
        mat: numpy.zeros((20,10), dtype=numpy.float32),
        row: numpy.zeros((1,10,), dtype=numpy.float32)
    },
    on_unused_input='ignore'
)
f()

Error:

ValueError: Input dimension mis-match. (input[0].shape[0] = 20, input[1].shape[0] = 1)
Apply node that caused the error: Elemwise{add,no_inplace}(<TensorType(float32, matrix)>, <TensorType(float32, matrix)>)
Inputs types: [TensorType(float32, matrix), TensorType(float32, matrix)]
Inputs shapes: [(20, 10), (1, 10)]
Inputs strides: [(40, 4), (40, 4)]
Inputs values: ['not shown', 'not shown']

Backtrace when the node is created:
  File "<ipython-input-55-0f03bee478ec>", line 5, in <module>
    mat + row,

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

So we cannot simply use a 1 x N matrix as a row, because the first dimension of a 1 x N matrix is not marked as broadcastable.
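
The underlying reason is that the broadcastable pattern belongs to the variable's type and is fixed when the graph is built, while the shape (1, 10) only becomes known at run time. A rough plain-Python model of the check an elementwise addition performs, under that assumption, is sketched below. `elemwise_output_shape` is a hypothetical helper written for this illustration, not Theano's implementation.

```python
# Toy model in plain Python; not Theano's actual implementation.
# elemwise_output_shape is a hypothetical helper for illustration only.

def elemwise_output_shape(shapes, patterns):
    """Return the output shape of an elementwise op over same-rank inputs.

    A dimension may be stretched only if its pattern entry is True;
    a runtime length of 1 alone is not enough.
    """
    ndim = len(shapes[0])
    out = []
    for d in range(ndim):
        # Lengths of the inputs that are NOT declared broadcastable here.
        fixed = {s[d] for s, p in zip(shapes, patterns) if not p[d]}
        if len(fixed) > 1:
            raise ValueError(
                "Input dimension mis-match in dimension %d: %s"
                % (d, sorted(fixed))
            )
        out.append(fixed.pop() if fixed else 1)
    return tuple(out)


shapes = [(20, 10), (1, 10)]

# A matrix-typed (1, 10) input fails, much like the traceback above.
try:
    elemwise_output_shape(shapes, [(False, False), (False, False)])
except ValueError as err:
    print("matrix + matrix:", err)

# A row-typed (1, 10) input is stretched along the first dimension.
print("matrix + row:",
      elemwise_output_shape(shapes, [(False, False), (True, False)]))
```

The same runtime shapes succeed or fail depending only on the declared pattern, which is why the fix has to change the type of the second operand rather than its data.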

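The question remains: what can we do? How can I create a shared variable of type row, so that it broadcasts when added to a matrix?

2 Answers:

Answer 0: (score: 2)

An alternative to reshape(1, N) is dimshuffle('x', 0), as described in the documentation.

Here is a demonstration of both approaches: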

import numpy
import theano

x = theano.shared(numpy.arange(10))
print(x)
print(x.dimshuffle('x', 0).type)
print(x.dimshuffle(0, 'x').type)
print(x.reshape((1, x.shape[0])).type)
print(x.reshape((x.shape[0], 1)).type)

f = theano.function([], outputs=[x, x.dimshuffle('x', 0), x.reshape((1, x.shape[0]))])
theano.printing.debugprint(f)

This prints:

<TensorType(int32, vector)>
TensorType(int32, row)
TensorType(int32, col)
TensorType(int32, row)
TensorType(int32, col)
DeepCopyOp [@A] ''   2
 |<TensorType(int32, vector)> [@B]
DeepCopyOp [@C] ''   4
 |InplaceDimShuffle{x,0} [@D] ''   1
   |<TensorType(int32, vector)> [@B]
DeepCopyOp [@E] ''   6
 |Reshape{2} [@F] ''   5
   |<TensorType(int32, vector)> [@B]
   |MakeVector{dtype='int64'} [@G] ''   3
     |TensorConstant{1} [@H]
     |Shape_i{0} [@I] ''   0
       |<TensorType(int32, vector)> [@B]

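This suggests that dimshuffle is probably preferable: in the compiled graph it is a single InplaceDimShuffle node, whereas reshape needs Shape_i, MakeVector and Reshape nodes.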
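
To see why `dimshuffle('x', 0)` yields a row, it may help to look at how the new broadcastable pattern is derived: each `'x'` inserts a broadcastable dimension, and each integer carries over the flag of the input dimension it names. The sketch below models only that bookkeeping in plain Python. `dimshuffle_pattern` is a hypothetical helper, not a Theano function.

```python
# Toy model in plain Python; dimshuffle_pattern is a hypothetical helper.

def dimshuffle_pattern(input_pattern, new_order):
    """Broadcastable pattern after reordering dimensions as in `new_order`.

    'x' adds a new broadcastable dimension (True); an integer i reuses
    the flag of input dimension i.
    """
    out = []
    for entry in new_order:
        if entry == 'x':
            out.append(True)
        else:
            out.append(input_pattern[entry])
    return tuple(out)


vector = (False,)
print(dimshuffle_pattern(vector, ('x', 0)))  # (True, False): a row
print(dimshuffle_pattern(vector, (0, 'x')))  # (False, True): a column
```

Applied to the question, this suggests that an expression such as `mat + shared_vector.dimshuffle('x', 0)` should broadcast as intended, since the second operand then has type row.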

Answer 1: (score: 1)

I would use:

shared_row = theano.shared(numpy.zeros((1,10,)), broadcastable=(True, False))
print(shared_row.type)
# TensorType(float64, row)
print(shared_row.broadcastable)
# (True, False)