Below is a snippet of the Theano variable and function definitions for my RBFNN (radial-basis-function neural network).
# Theano Tensor Definition
self.X = T.dmatrix('X')
self.y = T.dmatrix('y')
self.centers = T.dmatrix('centers')
self.sigmas = T.dvector('sigmas')
self.W = theano.shared(np.zeros((self.n_centers, self.n_classes)), 'W')
self.b = theano.shared(np.zeros((self.n_classes,)), 'b')
# Build Graph
self.phi = T.exp(-T.sum(T.square(self.X[:, np.newaxis, :] - self.centers[np.newaxis, :, :]), axis=-1) / (
2 * T.square(self.sigmas)))
self.prob = T.mul(self.phi, self.W) + self.b
self.pred = T.argmax(self.prob, axis=1)
self.loss = T.mean(T.sum(T.square(self.y - self.prob), axis=1))
self.W_grad = T.grad(self.loss, self.W)
self.b_grad = T.grad(self.loss, self.b)
self.updates = [(self.W, self.W - self.learning_rate * self.W_grad),
(self.b, self.b - self.learning_rate * self.b_grad)]
# Build Function
self.one_step = theano.function([self.X, self.y, self.centers, self.sigmas], [self.loss], updates=self.updates)
self.compute_prob = theano.function([self.X, self.centers, self.sigmas], [self.prob])
self.compute_pred = theano.function([self.X, self.centers, self.sigmas], [self.pred])
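As a sanity check on the graph construction, the broadcasting used for `self.phi` can be traced in plain NumPy. This is only a sketch; the sizes (50 samples, 10 centers, 4 features) are read off the error trace below and are not part of the original code.

```python
import numpy as np

# Shapes taken from the error trace: 50 samples, 10 centers, 4 features.
X = np.random.randn(50, 4)
centers = np.random.randn(10, 4)
sigmas = np.ones(10)

# X[:, np.newaxis, :] has shape (50, 1, 4) and centers[np.newaxis, :, :]
# has shape (1, 10, 4); broadcasting the subtraction yields (50, 10, 4).
diff = X[:, np.newaxis, :] - centers[np.newaxis, :, :]

# Summing the squared differences over the feature axis gives (50, 10),
# and dividing by the squared sigmas (shape (10,)) broadcasts cleanly.
phi = np.exp(-np.sum(np.square(diff), axis=-1) / (2 * np.square(sigmas)))

print(diff.shape)  # (50, 10, 4)
print(phi.shape)   # (50, 10)
```

So the RBF activation matrix `phi` itself comes out with the expected shape, one row per sample and one column per center.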
Then I feed in the data:
for i in xrange(max_iter):
    losses = []
    for batch in xrange(X.shape[0] / n_batch):
        losses.append(self.one_step(X[batch * n_batch:(batch + 1) * n_batch, :],
                                    y[batch * n_batch:(batch + 1) * n_batch, :],
                                    centers, sigmas))
Here X and y are the training data, and centers and sigmas are the k-means centroids and the standard deviation of the samples assigned to each centroid.
Eventually it raises an error:
ValueError: Input dimension mis-match. (input[0].shape[0] = 50, input[2].shape[0] = 10)
Apply node that caused the error: Elemwise{Composite{(i0 - ((i1 * i2) + i3))}}(y, Elemwise{Composite{exp(((i0 * i1) / i2))}}[(0, 1)].0, W, InplaceDimShuffle{x,0}.0)
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix), TensorType(float64, matrix), TensorType(float64, row)]
Inputs shapes: [(50, 2), (50, 10), (10, 2), (1, 2)]
Inputs strides: [(16, 8), (80, 8), (16, 8), (16, 8)]
Inputs values: ['not shown', 'not shown', 'not shown', array([[ 0., 0.]])]
Debugprint of the apply node:
Elemwise{Composite{(i0 - ((i1 * i2) + i3))}} [@A] <TensorType(float64, matrix)> ''
|y [@B] <TensorType(float64, matrix)>
|Elemwise{Composite{exp(((i0 * i1) / i2))}}[(0, 1)] [@C] <TensorType(float64, matrix)> ''
| |TensorConstant{(1, 1) of -0.5} [@D] <TensorType(float64, (True, True))>
| |Sum{axis=[2], acc_dtype=float64} [@E] <TensorType(float64, matrix)> ''
| | |Elemwise{Composite{sqr((i0 - i1))}} [@F] <TensorType(float64, 3D)> ''
| | |InplaceDimShuffle{0,x,1} [@G] <TensorType(float64, (False, True, False))> ''
| | | |X [@H] <TensorType(float64, matrix)>
| | |InplaceDimShuffle{x,0,1} [@I] <TensorType(float64, (True, False, False))> ''
| | |centers [@J] <TensorType(float64, matrix)>
| |Elemwise{sqr,no_inplace} [@K] <TensorType(float64, row)> ''
| |InplaceDimShuffle{x,0} [@L] <TensorType(float64, row)> ''
| |sigmas [@M] <TensorType(float64, vector)>
|W [@N] <TensorType(float64, matrix)>
|InplaceDimShuffle{x,0} [@O] <TensorType(float64, row)> ''
|b [@P] <TensorType(float64, vector)>
Storage map footprint:
- InplaceDimShuffle{x,0}.0, Shape: (1, 2), ElemSize: 8 Byte(s), TotalSize: 16 Byte(s)
- TensorConstant{(1,) of 0.001}, Shape: (1,), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
- Elemwise{Composite{exp(((i0 * i1) / i2))}}[(0, 1)].0, Shape: (50, 10), ElemSize: 8 Byte(s), TotalSize: 4000 Byte(s)
- X, Shape: (50, 4), ElemSize: 8 Byte(s), TotalSize: 1600 Byte(s)
- y, Shape: (50, 2), ElemSize: 8 Byte(s), TotalSize: 800 Byte(s)
- centers, Shape: (10, 4), ElemSize: 8 Byte(s), TotalSize: 320 Byte(s)
- sigmas, Shape: (10,), ElemSize: 8 Byte(s), TotalSize: 80 Byte(s)
- b, Shape: (2,), ElemSize: 8 Byte(s), TotalSize: 16 Byte(s)
- TensorConstant{(1, 1) of -0.002}, Shape: (1, 1), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
- TensorConstant{(1,) of -2.0}, Shape: (1,), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
- TensorConstant{(1, 1) of -0.5}, Shape: (1, 1), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
- W, Shape: (10, 2), ElemSize: 8 Byte(s), TotalSize: 160 Byte(s)
Shouldn't the shapes of self.X and self.centers after indexing be [50, newaxis, 4] and [newaxis, 10, 4]? Why do they turn out to be [50, 4] and [10, 4] in the trace? Can you tell me how to fix this?
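For reference, the reported mismatch can be reproduced in plain NumPy with the shapes from the trace: the element-wise product `phi * W` (which is what `T.mul` computes) tries to broadcast (50, 10) against (10, 2) and fails, whereas a matrix product (`T.dot` in Theano) would give the (50, 2) output that `self.y` and the loss expect. A minimal sketch:

```python
import numpy as np

# Same shapes as in the trace: phi is (50, 10), W is (10, 2), b is (2,).
phi = np.ones((50, 10))
W = np.ones((10, 2))
b = np.zeros(2)

try:
    prob = phi * W + b  # element-wise, like T.mul: (50, 10) vs (10, 2)
except ValueError as e:
    print(e)            # operands could not be broadcast together

prob = phi.dot(W) + b   # matrix product, like T.dot
print(prob.shape)       # (50, 2)
```

The debugprint showing `X` and `centers` as plain (50, 4) and (10, 4) matrices is expected: the `np.newaxis` indexing becomes `InplaceDimShuffle` nodes in the compiled graph, which list the original matrices as their inputs.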