I am training an object detection network with TensorFlow-backed Keras.
Because GPU memory is limited, I have to accumulate gradients and only apply a parameter update every few steps, effectively "enlarging" the batch size. But my implementation fails to converge when training Mask R-CNN, and even on a simple CIFAR-10 classification task it performs much worse than the plain optimizer:
class custom_SGD(Optimizer):
    def get_updates(self, loss, params):
        """Main parameter update operation."""
        shapes = [K.int_shape(p) for p in params]
        if self.nesterov:
            sum_grads = [K.zeros(shape) for shape in shapes]
        moments = [K.zeros(shape) for shape in shapes]
        sum_moments = [K.zeros(shape) for shape in shapes]
        # Current gradients
        grads = self.get_gradients(loss, params)
        self.updates = [K.update_add(self.iterations, 1)]
        self.weights = [self.iterations] + moments
        cond1 = K.equal(self.iterations % self.steps_per_update, 0)
        # Learning rate decay
        lr = self.lr
        if self.initial_decay > 0:
            lr = lr * (1. / (1. + self.decay * K.cast(self.iterations,
                                                      K.dtype(self.decay))))
        if not self.nesterov:
            for p, g, m, sm in zip(params, grads, moments, sum_moments):
                v = self.momentum * m - lr * g
                # Accumulate the momentum step into sum_moments
                self.updates.append(K.update(sm, sm + v))
                # Every steps_per_update steps, average the accumulated steps
                self.updates.append(K.switch(
                    cond1, K.update(m, sm / float(self.steps_per_update)), m))
                new_p = p + m
                # Apply constraint
                if getattr(p, 'constraint', None) is not None:
                    new_p = p.constraint(new_p)
                self.updates.append(K.switch(cond1, K.update(p, new_p), p))
                # Clear up the accumulator
                self.updates.append(K.switch(
                    cond1, K.update(sm, K.zeros_like(sm)), sm))
        return self.updates
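For comparison, here is what I understand the intended semantics to be, written as a minimal plain-Python sketch (the function name and the single-scalar-parameter toy setup are hypothetical, just to make the math checkable): raw gradients are summed for steps_per_update steps, and one momentum update is applied with their mean, as if a larger batch had been used.

```python
def accumulated_sgd(grads, lr=0.1, momentum=0.9, steps_per_update=4):
    """Reference semantics of gradient accumulation with momentum.

    grads: list of per-step gradients for one scalar parameter
    (a hypothetical toy setup, not the Keras optimizer above).
    """
    p, m, acc = 0.0, 0.0, 0.0
    for i, g in enumerate(grads, start=1):
        acc += g                       # accumulate raw gradients
        if i % steps_per_update == 0:
            # one momentum step with the mean accumulated gradient
            m = momentum * m - lr * (acc / steps_per_update)
            p += m                     # single parameter update per window
            acc = 0.0                  # reset the accumulator
    return p
```

With momentum disabled, four accumulated unit gradients should give exactly one plain SGD step of size lr, which is the equivalence I am trying to reproduce in the Keras code above.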
Can someone tell me why? Thanks in advance.