Question

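I am trying to train an MLP in Theano using Newton's method. I have worked out how to get the inverse Hessian (it is simple enough: I just had to remove some of the asserts in the nlinalg module). However, when I try to update my weight vector with my gradient_descent function, I get the following error: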

numpy.linalg.linalg.LinAlgError: Singular matrix
Apply node that caused the error: MatrixInverse(InplaceDimShuffle{x,0,1}.0)

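Obviously, if my Hessian is singular, it cannot be inverted. But the weights are random, so why would it always be singular? I know very little about matrix calculus, but it seems to me that there must be a flaw in my procedure. Here is the relevant part of my code: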

def gradient_descent(error, weights, w_flat, learning_rate=0.1):
    # decrease flattened weights by the product of the inverse Hessian and the gradient
    hess = T.hessian(cost=error, wrt=[w_flat])
    grad = T.grad(error, wrt=[w_flat])
    mult = T.nlinalg.matrix_inverse(hess) * grad
    update = w_flat - learning_rate * mult

    return T.flatten(update)

#initialize weight matrices
w_h_flat = theano.shared(np.array(np.random.randn(11,6), dtype=theano.config.floatX).flatten())
w_hidden = w_h_flat.reshape((11,6))
w_o_flat = theano.shared(np.array(np.random.randn(7,1), dtype=theano.config.floatX).flatten())
w_output = w_o_flat.reshape((7,1))

# ... other mlp stuff, define the cost, etc ...

train = theano.function(
    inputs=[x, y],
    outputs=cost,
    updates=[
        (w_h_flat, gradient_descent(cost, w_hidden, w_h_flat)),
        (w_o_flat, gradient_descent(cost, w_output, w_o_flat)),
    ],
)

Any pointers as to what I am doing wrong would be greatly appreciated.
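
For reference, the update described in the comment inside gradient_descent (subtracting the product of the inverse Hessian and the gradient) can be sketched in plain NumPy. This is only a minimal illustration under stated assumptions, not the Theano code from the question: the names damped_newton_step, toy_cost, toy_grad and toy_hess, the damping parameter, and the toy cost itself are all hypothetical. The sketch uses a matrix-vector solve rather than an elementwise `*`, and adds a small multiple of the identity to the Hessian, which is a common workaround when the Hessian is singular. Whether damping is the right remedy for this particular MLP is not established here.

```python
import numpy as np


def damped_newton_step(w, grad, hess, learning_rate=0.1, damping=1e-3):
    """Return w - learning_rate * (H + damping * I)^{-1} g.

    `damping` is an assumption of this sketch: it keeps the linear
    system solvable even when `hess` itself is singular.
    """
    n = hess.shape[0]
    regularized = hess + damping * np.eye(n)
    # Solve the linear system instead of forming the inverse explicitly.
    direction = np.linalg.solve(regularized, grad)
    return w - learning_rate * direction


# Hypothetical toy cost with a singular Hessian:
# f(w) = 0.5 * (w[0] + w[1] - 1)^2
def toy_cost(w):
    return 0.5 * (w[0] + w[1] - 1.0) ** 2


def toy_grad(w):
    return (w[0] + w[1] - 1.0) * np.ones(2)


def toy_hess(w):
    # Rank 1 for every w, so it can never be inverted directly.
    return np.ones((2, 2))


if __name__ == "__main__":
    w = np.zeros(2)
    for _ in range(5):
        w = damped_newton_step(w, toy_grad(w), toy_hess(w), learning_rate=1.0)
    print(w, toy_cost(w))
```

In this toy problem the Hessian has rank 1 regardless of the weights, so random initialization does not make it invertible; the damped system is still solvable and the cost decreases.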

The Hessian matrix is always singular in Theano

0 answers: