I am trying to add restarts of the learning rate, i.e. my own less-sophisticated version of Stochastic Gradient Descent with Warm Restarts by Loshchilov and Hutter (https://arxiv.org/abs/1608.03983), to my sequence predictor CNN model.
As a first experiment, my idea is to have the learning rate start at 0.3 and halve on every epoch, then reset it to 0.3 on every 15th epoch.
LR <- 0.3
optimizer_sgd(lr = LR, nesterov = TRUE)

lr_schedule <- function(epoch) {
  model_big$optimizer$lr = LR / (2^(epoch %% 15))
}

cb_lr <- callback_learning_rate_scheduler(lr_schedule)

model_big %>%
  compile(loss = 'binary_crossentropy',
          optimizer = 'sgd', metrics = 'accuracy')

history <- model_big %>%
  fit(x = list(x1, x2, x3), y = y, epochs = 31, callbacks = c(cb_lr))
However, I get the following error:
Epoch 1/31
Error in py_call_impl(callable, dots$args, dots$keywords) :
  RuntimeError: Evaluation error: unused argument (lr = 0.00999999977648258).

Detailed traceback:
  File "/Users/d/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/engine/training.py", line 1712, in fit
    validation_steps=validation_steps)
  File "/Users/d/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/engine/training.py", line 1180, in _fit_loop
    callbacks.on_epoch_begin(epoch)
  File "/Users/d/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/callbacks.py", line 63, in on_epoch_begin
    callback.on_epoch_begin(epoch, logs)
  File "/Users/d/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/callbacks.py", line 611, in on_epoch_begin
    lr = self.schedule(epoch, lr=lr)
  File "/Library/Frameworks/R.framework/Versions/3.4/Resources/library/reticulate/python/rpytools/call.py", line 21, in python_function
    raise RuntimeError(res[kErrorKey])
Can anyone help?
Answer 0 (score: 0)
As skeydan pointed out in the GitHub issue, the learning rate scheduler function must be defined with two arguments, one for the epoch (indexed from 0) and one for the current learning rate, and it must return the new learning rate value. See the documentation for more details.
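For example, a corrected version of the halving schedule from the question could look like this (a minimal sketch, reusing the LR constant and the callback wiring from the question):

LR <- 0.3

# The schedule now takes both the 0-indexed epoch and the current
# learning rate, and returns the new rate instead of assigning it.
lr_schedule <- function(epoch, lr) {
  LR / (2^(epoch %% 15))
}

cb_lr <- callback_learning_rate_scheduler(lr_schedule)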
My contribution to this question concerns what you originally set out to do: implement SGD with warm restarts in R Keras. Here I share my implementation with the community.
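The code below follows the cosine annealing rule from the paper, new_lr = lr_min + 1/2 * (lr_max - lr_min) * (1 + cos(pi * T_cur / T_i)), where T_cur counts epochs since the last restart and the cycle length T_i is multiplied by lr_t_mult at each restart. Note that it divides by lr_t_i - 1 rather than lr_t_i, so the rate reaches lr_min exactly on the last epoch of each cycle.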
lr_max <- 1e-2
lr_min <- 1e-5
lr_t_0 <- 10       # length of the first cycle, in epochs
lr_t_mult <- 2     # cycle length multiplier applied at each restart
lr_t_i <- lr_t_0   # current cycle length
lr_t_prev <- 0     # epoch at which the current cycle started

LearningRateScheduler <- function(epoch, lr) {
  # Position within the current cycle
  lr_t_curr <- (epoch - lr_t_prev) %% lr_t_i
  # Cosine annealing from lr_max down to lr_min over the cycle
  new_lr <- (lr_min
             + 1/2 * (lr_max - lr_min)
             * (1 + cos(lr_t_curr / (lr_t_i - 1) * pi)))
  # Restart: begin a new, longer cycle at the next epoch
  if ((lr_t_curr + 1) == lr_t_i) {
    lr_t_prev <<- epoch + 1
    lr_t_i <<- lr_t_i * lr_t_mult
  }
  new_lr
}
It can easily be tested with the following loop:
epochs <- 150
lr <- numeric(length = epochs)
for (e in seq(1, epochs, by = 1)) {
  lr[e] <- LearningRateScheduler(e - 1, NA)
}
The result is consistent with the figures in the paper cited above.
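To visualize the schedule against the paper's plots, and to actually use it in training, something like the following should work (a sketch; model_big and the training data are the objects from the question, and cb_warm is just a name I give the new callback):

plot(seq_len(epochs), lr, type = "l",
     xlab = "epoch", ylab = "learning rate")

cb_warm <- callback_learning_rate_scheduler(LearningRateScheduler)
model_big %>% fit(
  x = list(x1, x2, x3), y = y,
  epochs = epochs,
  callbacks = list(cb_warm)
)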