Question

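I am using stochastic gradient descent in Theano to solve a minimization problem. When I run my code, the first iterations seem to work, but after a while the optimized parameter (eta) suddenly becomes NaN, and so does its gradient g_eta. It looks like a Theano technical issue rather than a bug in my code, because I have checked the code in several different ways.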

Does anyone know what might cause this? My code is below:

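import numpy as np
import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

# X_comb_I, score, labels and n_i are defined elsewhere (not shown)
n_exp = 4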
features = theano.shared(value=X_comb_I, name='features', borrow=True)

x = T.dmatrix()
y = T.ivector()

srng = RandomStreams()
rv_u = srng.uniform((64,n_exp))


eta = theano.shared(value=rv_u.eval(), name='eta', borrow=True)

ndotx = T.exp(T.dot(features, eta))
g = ndotx/T.reshape(T.repeat( T.sum(ndotx, axis=1), (n_exp), axis=0),[n_i,n_exp])
my_score_given_eta = T.sum((g*x),axis=1)

cost = T.mean(T.abs_(my_score_given_eta - y))

g_eta = T.grad(cost=cost, wrt=eta)

learning_rate = 0.5

updates = [(eta, eta - learning_rate * g_eta)]

train_set_x = theano.shared(value=score, name='train_set_x', borrow=True)
train_set_y = theano.shared(value=labels.astype(np.int32), name='train_set_y', borrow=True)

train = theano.function(inputs=[],
                 outputs=cost,
                 updates=updates, givens={x: train_set_x, y: train_set_y})

validate = theano.function(inputs=[],
                outputs=cost, givens={x: train_set_x, y: train_set_y})

train_monitor = []
val_monitor = []

n_epochs = 1000

for epoch in range(n_epochs):
    loss = train()
    train_monitor.append(validate())

    if epoch%2 == 0:
        print("Iteration: %d" % epoch)
        print("Training error, validation error: %s" % train_monitor[-1])  #,  val_monitor[-1]

Thanks!

Answer 1

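The fact that you get the same problem, only more slowly, with a smaller learning rate suggests that your function may be unstable and is blowing up somewhere near the point where you start SGD.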
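One way such a blow-up can arise, offered as a guess about this code rather than a confirmed diagnosis, is the exp in ndotx overflowing once the entries of dot(features, eta) grow large: the normalisation then computes inf / inf, which is NaN. The NumPy sketch below reproduces that effect on made-up numbers and compares it with the common max-subtraction trick. The function names and input values are illustrative and do not come from the question.

```python
import numpy as np

def naive_softmax(z):
    # Same form as g in the question: exp(z) divided by its row sum.
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def stable_softmax(z):
    # Subtracting the row maximum does not change the result mathematically,
    # but it keeps every exponent <= 0, so exp cannot overflow.
    shifted = z - z.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

# Illustrative inputs (assumed values, not data from the question).
small = np.array([[1.0, 2.0, 3.0, 4.0]])
large = np.array([[1000.0, 2000.0, 3000.0, 4000.0]])

with np.errstate(over="ignore", invalid="ignore"):
    naive_small = naive_softmax(small)
    # exp overflows to inf for every entry, and inf / inf gives NaN.
    naive_large = naive_softmax(large)

stable_large = stable_softmax(large)

print(naive_large)   # all NaN
print(stable_large)  # finite values that sum to 1
```

If this is what happens in your case, checking the largest absolute value of dot(features, eta) just before the first NaN appears should show it.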

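Try different starting values
Adjust your cost function to penalize the nasty region that is blowing up
Try a different gradient descent method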
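The three suggestions above could look roughly like the NumPy sketch below. Every helper name and every constant (scale, weight_decay, max_norm, learning_rate) is a hypothetical choice made for illustration. Gradient-norm clipping is only one example of a "different gradient descent method", and the L2 penalty is only one way of penalizing large parameter values.

```python
import numpy as np

def init_eta(n_features, n_exp, scale=0.01, seed=0):
    # Hypothetical helper: small, zero-centred starting values
    # instead of draws from uniform(0, 1).
    rng = np.random.RandomState(seed)
    return rng.uniform(-scale, scale, size=(n_features, n_exp))

def penalized_cost(cost, eta, weight_decay=1e-3):
    # Add an L2 penalty so that very large parameter values become expensive.
    return cost + weight_decay * np.sum(eta ** 2)

def clipped_sgd_step(param, grad, learning_rate=0.05, max_norm=1.0):
    # Rescale the gradient when its L2 norm exceeds max_norm,
    # then take an ordinary SGD step.
    norm = np.sqrt(np.sum(grad ** 2))
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return param - learning_rate * grad

# Usage on made-up numbers: even an enormous gradient moves eta by at most
# learning_rate * max_norm in L2 norm.
eta0 = init_eta(64, 4)
huge_grad = np.full((64, 4), 1e6)
eta1 = clipped_sgd_step(eta0, huge_grad)
step_size = np.linalg.norm(eta1 - eta0)
```

In the Theano code the same ideas would be written symbolically on eta and g_eta before building the updates list. The values above are starting points to experiment with, not recommendations.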

Theano stochastic gradient descent NaN output
