Question

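This is a very simple beginner's Theano question.

I'm trying to modify the Logistic SGD code from the Deep Learning Tutorials to go from a single learning rate to per-dimension learning rates. For example, if I have 3 input dimensions, I want to use 3 different learning rates, one per dimension.

The original relevant code is: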

learning_rate = 0.1
x = T.matrix('x')
y = T.ivector('y')
classifier = LogisticRegression(input=x, n_in=3, n_out=2)
cost = classifier.negative_log_likelihood(y)

g_W = T.grad(cost=cost, wrt=classifier.W)
g_b = T.grad(cost=cost, wrt=classifier.b)

updates = [(classifier.W, classifier.W - learning_rate * g_W),
           (classifier.b, classifier.b - learning_rate * g_b)]

train_model = theano.function(inputs=[],
        outputs=cost,
        updates=updates,
        givens={
            x: minibatch_x,
            y: minibatch_y})

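In numpy, it would be enough to replace the scalar learning rate with an array of learning rates and multiply it elementwise with the gradients g_W and g_b. Doing this in Theano produces an error: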

'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?'

Clearly I'm missing something about Theano. Can anyone enlighten me?

Answer 1

Indeed, you need to replace the scalar learning rate with an array. You could try something like the following:

learning_rate = theano.shared(np.array([0.1, 0.2, 0.05]))

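You may need to transpose or reshape it depending on the shape of the gradients, but basically you have already described the right approach, and it should work with a shared variable.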
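To make the shape remark concrete, here is a minimal NumPy sketch of the per-dimension update, not the tutorial's code. It assumes W has shape (n_in, n_out) = (3, 2) as in the question; the arrays W, g_W, b, g_b and the scalar bias_rate are hypothetical stand-ins for illustration only.

```python
import numpy as np

n_in, n_out = 3, 2

# One learning rate per input dimension (values from the answer above).
learning_rate = np.array([0.1, 0.2, 0.05])

# Stand-ins for classifier.W and its gradient, both of shape (n_in, n_out).
W = np.zeros((n_in, n_out))
g_W = np.ones((n_in, n_out))

# Reshape the rates to a column of shape (n_in, 1) so that row i of g_W
# is scaled by learning_rate[i]. A plain (3,) vector does not broadcast
# against a (3, 2) matrix.
lr_col = learning_rate[:, np.newaxis]
W_new = W - lr_col * g_W

# The bias has shape (n_out,), so per-input rates do not line up with it.
# A single scalar rate is used here; this choice is an assumption.
b = np.zeros(n_out)
g_b = np.ones(n_out)
bias_rate = 0.1
b_new = b - bias_rate * g_b
```

In Theano the same reshape should be expressible on the shared vector as learning_rate.dimshuffle(0, 'x') inside the update for classifier.W, while the update for classifier.b keeps a scalar (or a separate length-n_out) rate.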

Theano Logistic SGD with per-dimension learning rates
