Question

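I am working through several tutorials on how to implement learning rate decay when training a model with Tensorflow. In some of the examples I noticed it is implemented as follows. For example, in this tutorial on how to model language using recurrent neural networks: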

self._lr = tf.Variable(0.0, trainable=False)
optimizer = tf.train.GradientDescentOptimizer(self._lr)
# Some other code...

self._new_lr = tf.placeholder(
    tf.float32, shape=[], name="new_learning_rate")
self._lr_update = tf.assign(self._lr, self._new_lr)

The learning rate is updated by calling the following method:

def assign_lr(self, session, lr_value):
    session.run(self._lr_update, feed_dict={self._new_lr: lr_value})

Then, during training:

for i in range(config.max_max_epoch):
    lr_decay = config.lr_decay ** max(i + 1 - config.max_epoch, 0.0)
    m.assign_lr(session, config.learning_rate * lr_decay)
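
To make the schedule in this loop concrete, here is a minimal sketch in plain Python (no Tensorflow needed) that evaluates the same decay expression for each epoch. The numeric values are hypothetical, chosen only for illustration; they are not taken from the tutorial's config.

```python
# Hypothetical config values, chosen only for illustration.
learning_rate = 1.0   # initial learning rate
lr_decay = 0.5        # multiplicative decay factor
max_epoch = 4         # number of epochs trained at the initial rate
max_max_epoch = 8     # total number of epochs


def learning_rate_for_epoch(i):
    """Return the learning rate for zero-based epoch index i."""
    # Same expression as in the training loop: the exponent stays at 0
    # until epoch max_epoch, then grows by 1 per epoch.
    decay = lr_decay ** max(i + 1 - max_epoch, 0.0)
    return learning_rate * decay


schedule = [learning_rate_for_epoch(i) for i in range(max_max_epoch)]
for i, lr in enumerate(schedule):
    print("epoch %d: lr = %g" % (i + 1, lr))
```

With these assumed values the rate stays at its initial value for the first max_epoch epochs and is then multiplied by lr_decay once per epoch.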

Why is it implemented this way? At first glance it seems overly complicated. Why wouldn't something like this work?

self._lr = tf.placeholder(
    tf.float32, shape=[], name="new_learning_rate")
optimizer = tf.train.GradientDescentOptimizer(self._lr)
# Some other code...

Answer 1

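Every time you call

optimizer = tf.train.GradientDescentOptimizer(self._lr)

you basically create a new optimizer with the modified learning rate. But keep in mind that the learning rate is not the only parameter in an optimizer; there are others, such as momentum, that get reinitialized when you do this.

So proper scheduling keeps the other parameters intact and modifies only the learning rate in place, by storing the learning rate as a tf.Variable and updating it by calling tf.assign.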
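
The point about optimizer state can be illustrated with a toy sketch. This is plain Python, not Tensorflow: the MomentumSGD class and the grad function below are hypothetical stand-ins written only for this example. It contrasts changing the learning rate in place on one optimizer with building a fresh optimizer that carries the new rate.

```python
class MomentumSGD:
    """Toy momentum optimizer for one scalar parameter (not a Tensorflow class)."""

    def __init__(self, lr, momentum=0.9):
        self.lr = lr              # mutable, playing the role of the tf.Variable
        self.momentum = momentum
        self.velocity = 0.0       # internal state accumulated across steps

    def step(self, param, grad_value):
        # Accumulate velocity, then move the parameter against it.
        self.velocity = self.momentum * self.velocity + grad_value
        return param - self.lr * self.velocity


def grad(x):
    return 2.0 * x  # gradient of f(x) = x ** 2


# Case A: keep one optimizer and change its learning rate in place.
opt = MomentumSGD(lr=0.1)
x = 1.0
for _ in range(3):
    x = opt.step(x, grad(x))
opt.lr = 0.05                      # analogous to running the assign op
velocity_kept = opt.velocity       # accumulated state is still there

# Case B: build a fresh optimizer with the new learning rate.
new_opt = MomentumSGD(lr=0.05)
velocity_reset = new_opt.velocity  # state starts again from zero

print("kept:", velocity_kept, "reset:", velocity_reset)
```

In case A the accumulated velocity survives the learning rate change; in case B it starts over at zero, which is the reinitialization described above.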

Hope this answers your question.

How is the learning rate update performed in the Tensorflow tutorial?
