Theano dmatrix with newaxis raises dimension mismatch

Time: 2016-01-21 04:24:35

Tags: python machine-learning theano

Below is a snippet of the Theano variable and function definitions for my RBFNN (radial basis function neural network).

    # Theano Tensor Definition
    self.X = T.dmatrix('X')
    self.y = T.dmatrix('y')
    self.centers = T.dmatrix('centers')
    self.sigmas = T.dvector('sigmas')

    self.W = theano.shared(np.zeros((self.n_centers, self.n_classes)), 'W')
    self.b = theano.shared(np.zeros((self.n_classes,)), 'b')

    # Build Graph
    self.phi = T.exp(-T.sum(T.square(self.X[:, np.newaxis, :] - self.centers[np.newaxis, :, :]), axis=-1) / (
        2 * T.square(self.sigmas)))
    self.prob = T.mul(self.phi, self.W) + self.b
    self.pred = T.argmax(self.prob, axis=1)
    self.loss = T.mean(T.sum(T.square(self.y - self.prob), axis=1))
    self.W_grad = T.grad(self.loss, self.W)
    self.b_grad = T.grad(self.loss, self.b)
    self.updates = [(self.W, self.W - self.learning_rate * self.W_grad),
                    (self.b, self.b - self.learning_rate * self.b_grad)]

    # Build Function
    self.one_step = theano.function([self.X, self.y, self.centers, self.sigmas], [self.loss], updates=self.updates)
    self.compute_prob = theano.function([self.X, self.centers, self.sigmas], [self.prob])
    self.compute_pred = theano.function([self.X, self.centers, self.sigmas], [self.pred])

Then I feed in the data:

for i in xrange(max_iter):
    losses = []
    for batch in xrange(X.shape[0] / n_batch):
        losses.append(self.one_step(X[batch * n_batch:(batch + 1) * n_batch, :], y[batch * n_batch:(batch + 1) * n_batch, :], centers, sigmas))

Here X and y are the training data, and centers and sigmas are the k-means centers and the standard deviation of each center.

This eventually raises the following error:

ValueError: Input dimension mis-match. (input[0].shape[0] = 50, input[2].shape[0] = 10)
Apply node that caused the error: Elemwise{Composite{(i0 - ((i1 * i2) + i3))}}(y, Elemwise{Composite{exp(((i0 * i1) / i2))}}[(0, 1)].0, W, InplaceDimShuffle{x,0}.0)
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix), TensorType(float64, matrix), TensorType(float64, row)]
Inputs shapes: [(50, 2), (50, 10), (10, 2), (1, 2)]
Inputs strides: [(16, 8), (80, 8), (16, 8), (16, 8)]
Inputs values: ['not shown', 'not shown', 'not shown', array([[ 0.,  0.]])]
Debugprint of the apply node: 
Elemwise{Composite{(i0 - ((i1 * i2) + i3))}} [@A] <TensorType(float64, matrix)> ''   
 |y [@B] <TensorType(float64, matrix)>
 |Elemwise{Composite{exp(((i0 * i1) / i2))}}[(0, 1)] [@C] <TensorType(float64, matrix)> ''   
 | |TensorConstant{(1, 1) of -0.5} [@D] <TensorType(float64, (True, True))>
 | |Sum{axis=[2], acc_dtype=float64} [@E] <TensorType(float64, matrix)> ''   
 | | |Elemwise{Composite{sqr((i0 - i1))}} [@F] <TensorType(float64, 3D)> ''   
 | |   |InplaceDimShuffle{0,x,1} [@G] <TensorType(float64, (False, True, False))> ''   
 | |   | |X [@H] <TensorType(float64, matrix)>
 | |   |InplaceDimShuffle{x,0,1} [@I] <TensorType(float64, (True, False, False))> ''   
 | |     |centers [@J] <TensorType(float64, matrix)>
 | |Elemwise{sqr,no_inplace} [@K] <TensorType(float64, row)> ''   
 |   |InplaceDimShuffle{x,0} [@L] <TensorType(float64, row)> ''   
 |     |sigmas [@M] <TensorType(float64, vector)>
 |W [@N] <TensorType(float64, matrix)>
 |InplaceDimShuffle{x,0} [@O] <TensorType(float64, row)> ''   
   |b [@P] <TensorType(float64, vector)>

Storage map footprint:
 - InplaceDimShuffle{x,0}.0, Shape: (1, 2), ElemSize: 8 Byte(s), TotalSize: 16 Byte(s)
 - TensorConstant{(1,) of 0.001}, Shape: (1,), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
 - Elemwise{Composite{exp(((i0 * i1) / i2))}}[(0, 1)].0, Shape: (50, 10), ElemSize: 8 Byte(s), TotalSize: 4000 Byte(s)
 - X, Shape: (50, 4), ElemSize: 8 Byte(s), TotalSize: 1600 Byte(s)
 - y, Shape: (50, 2), ElemSize: 8 Byte(s), TotalSize: 800 Byte(s)
 - centers, Shape: (10, 4), ElemSize: 8 Byte(s), TotalSize: 320 Byte(s)
 - sigmas, Shape: (10,), ElemSize: 8 Byte(s), TotalSize: 80 Byte(s)
 - b, Shape: (2,), ElemSize: 8 Byte(s), TotalSize: 16 Byte(s)
 - TensorConstant{(1, 1) of -0.002}, Shape: (1, 1), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
 - TensorConstant{(1,) of -2.0}, Shape: (1,), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
 - TensorConstant{(1, 1) of -0.5}, Shape: (1, 1), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
 - W, Shape: (10, 2), ElemSize: 8 Byte(s), TotalSize: 160 Byte(s)

Shouldn't the shapes of self.X and self.centers be [50, newaxis, 4] and [newaxis, 10, 4]? Why do they turn out to be [50, 4] and [10, 4]? Could you tell me how to fix this?

1 answer:

Answer 0 (score: 0)

You can try to debug the problem by using test values while building the graph.
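
In Theano, test values are attached to a symbolic variable through its tag.test_value attribute, with theano.config.compute_test_value switched on, so that shape errors surface on the line that builds the offending expression. The sketch below is not Theano code. It is a plain NumPy stand-in that pushes concrete arrays with the shapes from the traceback through the same expressions, which is roughly what test values would do. It is one possible reading of the error, not a confirmed diagnosis: the newaxis indexing seems to behave as intended (the debug print shows InplaceDimShuffle{0,x,1} and InplaceDimShuffle{x,0,1}, and the [50, 4] and [10, 4] shapes in the storage map are simply the raw inputs X and centers), while the failing node multiplies the (50, 10) activations by the (10, 2) matrix W element by element, which matches T.mul(self.phi, self.W). The random data, the unit sigmas, and the idea of replacing the elementwise product with a matrix product (T.dot in Theano, np.dot here) are assumptions made for illustration.

```python
import numpy as np

# Concrete stand-ins with the shapes reported in the traceback:
# 50 samples per batch, 4 features, 10 centers, 2 classes.
# The values themselves are made up for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
centers = rng.normal(size=(10, 4))
sigmas = np.ones(10)
W = np.zeros((10, 2))
b = np.zeros(2)

# The newaxis indexing broadcasts the difference to (50, 10, 4).
# Summing over the last axis leaves one activation per (sample, center).
diff = X[:, np.newaxis, :] - centers[np.newaxis, :, :]
phi = np.exp(-np.sum(np.square(diff), axis=-1) / (2 * np.square(sigmas)))
print(diff.shape, phi.shape)

# Elementwise product of (50, 10) and (10, 2): the shapes do not line up.
try:
    phi * W
    elementwise_ok = True
except ValueError as err:
    elementwise_ok = False
    print("elementwise product failed:", err)

# Matrix product of (50, 10) and (10, 2): one score per sample and class.
prob = np.dot(phi, W) + b
print(prob.shape)
```

If this reading is right, the (50, 2) result of the matrix product is what the later subtraction from y, which is also (50, 2), expects.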