I'm having a hard time understanding how the optimizer picks up a custom learning rate defined with LearningRateScheduler. Looking at the source of SGD, the parameter updates seem to be driven only by the optimizer's own attributes (e.g. self.lr):
def get_updates(self, loss, params):
    grads = self.get_gradients(loss, params)
    self.updates = [K.update_add(self.iterations, 1)]

    lr = self.lr
    if self.initial_decay > 0:
        lr *= (1. / (1. + self.decay * K.cast(self.iterations,
                                              K.dtype(self.decay))))
    # momentum
    shapes = [K.int_shape(p) for p in params]
    moments = [K.zeros(shape) for shape in shapes]
    self.weights = [self.iterations] + moments
    for p, g, m in zip(params, grads, moments):
        v = self.momentum * m - lr * g  # velocity
        self.updates.append(K.update(m, v))

        if self.nesterov:
            new_p = p + self.momentum * v - lr * g
        else:
            new_p = p + v
        ...
https://github.com/keras-team/keras/blob/master/keras/optimizers.py#L172
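For context, my reading of the SGD constructor (a shortened paraphrase, not the exact source) is that self.lr is a Keras backend variable rather than a plain Python float, so in principle its value could be overwritten from outside after the optimizer has been built:

# Paraphrase of SGD.__init__ -- self.lr is a backend variable, not a Python float
class SGD(Optimizer):
    def __init__(self, lr=0.01, momentum=0., decay=0., nesterov=False, **kwargs):
        super(SGD, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.iterations = K.variable(0, dtype='int64', name='iterations')
            self.lr = K.variable(lr, name='lr')
            self.momentum = K.variable(momentum, name='momentum')
            self.decay = K.variable(decay, name='decay')
        self.initial_decay = decay
        self.nesterov = nesterov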
If fit_generator is called like this:
import math
from keras.callbacks import LearningRateScheduler

def step_decay(epoch):
    initial_lrate = 0.01
    drop = 0.5
    epochs_drop = 10.0
    lrate = initial_lrate * math.pow(drop, math.floor((1 + epoch) / epochs_drop))
    return lrate

lr_scheduler = LearningRateScheduler(step_decay)

model.fit_generator(
    ...
    callbacks=[lr_scheduler],
    ...)
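Just to spell out what I expect the schedule itself to produce, evaluating step_decay by hand gives 0.01 for epochs 0-8, 0.005 from epoch 9, 0.0025 from epoch 19, and so on:

# Sanity check of the schedule itself (not part of the training script)
for epoch in [0, 8, 9, 18, 19, 29]:
    print(epoch, step_decay(epoch))
# 0  0.01
# 8  0.01
# 9  0.005
# 18 0.005
# 19 0.0025
# 29 0.00125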
Is the original learning rate the optimizer was initialized with somehow overridden? It seems to be, at least to some extent. I tried the following printer callback:
from keras.callbacks import Callback

class LearningRatePrinter(Callback):
    def __init__(self):
        super(LearningRatePrinter, self).__init__()

    def on_epoch_begin(self, epoch, logs={}):
        # sess is the TF session and optimizer is the SGD instance created elsewhere
        print('lr:', sess.run(optimizer.lr))

lr_printer = LearningRatePrinter()
and called fit_generator() with callbacks=[lr_scheduler, lr_printer,]. What I get is lr: 0.01 for every epoch.
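For what it's worth, a variant that reads the value through the Keras backend instead of a raw session handle (assuming the standard keras.backend.get_value API and that the callback can reach the optimizer via self.model) would look like this:

from keras import backend as K
from keras.callbacks import Callback

class BackendLearningRatePrinter(Callback):
    # Same idea as above, but queries the backend variable directly
    def on_epoch_begin(self, epoch, logs=None):
        # self.model is attached by Keras before training starts
        print('lr:', K.get_value(self.model.optimizer.lr))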
Of course, lr_printer only prints the member variable lr, but how does the custom learning rate value from lr_scheduler actually get used? There is no indication that it is ever applied.
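My working assumption is that LearningRateScheduler is supposed to write the scheduled value back into optimizer.lr at the start of each epoch, roughly like the sketch below (my guess at the mechanism, not the actual Keras source), which is why I expected the printer to report the decayed values:

from keras import backend as K
from keras.callbacks import Callback

class MyGuessAtTheScheduler(Callback):
    # Hypothetical sketch of what I assume LearningRateScheduler does
    def __init__(self, schedule):
        super(MyGuessAtTheScheduler, self).__init__()
        self.schedule = schedule

    def on_epoch_begin(self, epoch, logs=None):
        lr = self.schedule(epoch)
        # overwrite the backend variable that get_updates reads as self.lr
        K.set_value(self.model.optimizer.lr, lr)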