Why don't my two TensorFlow variables update in sync?

Asked: 2019-06-04 04:19:31

Tags: python tensorflow keras keras-2

I am attempting (correctly or incorrectly) to write a modified form of the Keras SGD optimizer with the TensorFlow backend. The idea is to schedule "restarts" of SGD, with smaller learning rates, at specified epochs, without saving and reloading the model. To make the learning-rate decay restart along with these restarts, I need to track not only the total iteration count but also the number of iterations since the last "restart".
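For context, stock Keras SGD computes its time-based decay as lr * 1 / (1 + decay * iterations) from the total iteration count; the point of the second counter is to feed that same formula so the decay "rewinds" at each restart. A minimal sketch of that intent (decayed_lr is a hypothetical helper name, not part of the actual code below):

def decayed_lr(lr, decay, iterations_ref):
    """Hypothetical helper illustrating the intent: the stock Keras SGD
    decay formula, but driven by the iterations-since-last-restart
    counter instead of the total iteration count."""
    return lr * (1. / (1. + decay * iterations_ref))

# With lr=0.1 and decay=1e-4, the decayed rate jumps back up to (nearly)
# the base rate whenever iterations_ref is reset to 1 at a restart.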

So, after creating the SGD optimizer object (called SGD_VAR), I initialize an "iterations since last restart" counter (self.iterations_ref) to 0, just as self.iterations is initialized to 0. Then, on every iteration, I increment each counter by 1, unless there is a reset, in which case I reset the restart counter (self.iterations_ref) to 1. The code I am using is shown here (it inherits from Keras's SGD class, with only minor modifications):

import tensorflow as tf
from keras import backend as K
from keras.legacy import interfaces
from keras.optimizers import SGD


class SGD_VAR(SGD):
    """Stochastic gradient descent optimizer.

    Includes support for momentum,
    learning rate decay, and Nesterov momentum.

    # Arguments
        lr: float >= 0. Learning rate.
        momentum: float >= 0. Parameter that accelerates SGD
            in the relevant direction and dampens oscillations.
        decay: float >= 0. Learning rate decay over each update.
        nesterov: boolean. Whether to apply Nesterov momentum.
    """

    def __init__(self, lr=0.05, momentum=0., decay=0.,
                 nesterov=False, lr_dict={},
                 batches_per_epoch=1562,
                 **kwargs):

        super(SGD_VAR, self).__init__(lr, momentum, decay,
                                      nesterov, **kwargs)
        if lr_dict == {}:
            lr_dict = {0: lr}

        self.lr_dict = lr_dict
        self.batches_per_epoch = batches_per_epoch

        with K.name_scope(self.__class__.__name__):
            # Here is where I initialize *MY* iterations counter
            self.iterations_ref = K.variable(0, dtype='int64',
                                             name='iterations_ref')
            self.new_lr = K.variable(lr, name='new_lr')
    @interfaces.legacy_get_updates_support
    def get_updates(self, loss, params):

        def lr_stepper(iteration, lr):
            '''Wrapped python method used by the py_func tensor
               to determine the desired learning rate.'''

            # Change the learning rate when specified in
            # lr_dict (a dict of {epoch: learning rate})
            for x in self.lr_dict:
                temp = tf.Variable((x - 1) * self.batches_per_epoch,
                                   dtype=iteration.dtype)
                if tf.equal(temp, iteration):
                    return tf.constant(self.lr_dict[x], dtype=lr.dtype)

            return lr

        # NOTE: K.update_add and K.update return
        # tf.assign_add and tf.assign, respectively
        self.updates = [K.update_add(self.iterations, 1)]

        # Key lines to change self.lr
        new_lr = tf.contrib.eager.py_func(func=lr_stepper,
                                          inp=[self.iterations, self.lr],
                                          Tout=tf.float32)

        # Note: self.lr != new_lr indicates a RESET has occurred, in which
        # case the per-restart counter is reset to 1 instead of incremented
        new_iter_ref = tf.cond(tf.math.equal(self.lr, new_lr),
                               lambda: K.update_add(self.iterations_ref, 1),
                               lambda: K.update(self.iterations_ref, 1))
        self.updates.append(K.update(self.lr, new_lr))
        self.updates.append(new_iter_ref)

        # Temporary code to debug output: replace self.iterations with a
        # tf.Print op so any downstream use of it prints the three values
        self.iterations = tf.Print(self.lr,
                                   [self.iterations, self.iterations_ref, self.lr],
                                   message="\n Debug Vals:")

        # (The remainder of get_updates -- the standard SGD parameter
        # updates inherited from Keras's SGD, ending with
        # `return self.updates` -- is unchanged and omitted here.)
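
For completeness, a sketch of how I invoke the optimizer; the model, data, and the specific lr_dict values below are placeholders for illustration, not my actual training setup:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(10, activation='softmax', input_shape=(784,))])

# Placeholder schedule: restart with lr 0.01 at epoch 30, 0.001 at epoch 60
opt = SGD_VAR(lr=0.1, decay=1e-4,
              lr_dict={30: 0.01, 60: 0.001},
              batches_per_epoch=1562)
model.compile(optimizer=opt, loss='categorical_crossentropy')

# Dummy data just to drive a few training steps
x = np.random.rand(64, 784).astype('float32')
y = np.random.rand(64, 10).astype('float32')
model.fit(x, y, batch_size=32, epochs=1)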

I use tf.Print to print out self.iterations, self.iterations_ref, and self.lr; each value appears in square brackets. I had expected tf.Print to show self.iterations and self.iterations_ref equal to one another (setting aside the effect of any resets), but instead they stay offset by 1. That is, the output I see is:

 Debug Vals:[1][0][0.1]
 Debug Vals:[2][1][0.1]
 Debug Vals:[3][2][0.1]
 Debug Vals:[4][3][0.1]

...

What I had expected was:

 Debug Vals:[1][1][0.1]
 Debug Vals:[2][2][0.1]
 Debug Vals:[3][3][0.1]
 Debug Vals:[4][4][0.1]

...
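To check whether this is specific to my optimizer, here is a bare TF 1.x sketch of the same pattern, i.e. variables incremented by assign ops and read by a tf.Print within a single session.run (the variable names here are mine, for illustration only):

import tensorflow as tf  # TensorFlow 1.8

a = tf.Variable(0, dtype=tf.int64, name='a')
b = tf.Variable(0, dtype=tf.int64, name='b')

inc_a = tf.assign_add(a, 1)
inc_b = tf.assign_add(b, 1)

# The reads of a and b happen in the same session.run as the assigns;
# without explicit control dependencies the graph does not pin down
# whether each read sees the pre- or post-update value.
debug = tf.Print(tf.constant(0.1), [a, b], message="\n Debug Vals:")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(4):
        sess.run([inc_a, inc_b, debug])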

Why is this happening? (Note: I am using Keras 2.2.4 and TensorFlow 1.8.)

0 Answers:

No answers yet.