Question

I am trying to modify Keras's stochastic gradient descent class (SGD) so that I can assign new values to the learning rate at selected epochs. I did this by modifying the class's get_updates method as follows:

    @interfaces.legacy_get_updates_support
    def get_updates(self, loss, params):

        # --------------- My Changes Start ----------------------------------
        def lr_stepper(iteration, lr):
            ''' Wrapped python method used by tensor to determine desired learning rate'''

            # Change the learning rate at 2nd, 4th & 6th iteration
            for x in [2,4,6]:
                temp = tf.Variable(x, dtype=iteration.dtype)
                if tf.equal(temp, iteration):
                    return tf.constant(0.01*x)

            # Temporary code to stop run after 10 iterations
            temp = tf.Variable(10, dtype=iteration.dtype)
            if tf.equal(temp, iteration):
                sys.exit()

            return lr

        # Key lines to change self.lr
        new_lr = tf.contrib.eager.py_func(func=lr_stepper, inp=[self.iterations, self.lr], Tout=tf.float32)
        # NOTE: K.update_add and K.update return tf.assign_add and tf.assign, respectively
        self.updates = [K.update_add(self.iterations, 1), K.update(self.lr, new_lr)]

        # Temporary code to debug output
        self.iterations = tf.Print(self.lr, [self.iterations, self.lr], message="\n Debug Vals:")
        # --------------- My Changes Stop ----------------------------------

        grads = self.get_gradients(loss, params)

        # momentum
        shapes = [K.int_shape(p) for p in params]
        moments = [K.zeros(shape) for shape in shapes]
        self.weights = [self.iterations] + moments
        # etc...

The changes I made are intended to adjust the learning rate at the 2nd, 4th and 6th iterations, print some debug information, and then stop after 10 iterations. The output of tf.Print will be the message "Debug Vals:", followed by the current value of self.iterations in square brackets, then the value of self.lr (i.e. the learning rate), also in square brackets.

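When I use this code as the optimizer for cifar10_cnn.py, from Keras's examples, I get the following debug output:

    Debug Vals:[1][0.01]
    Debug Vals:[2][0.01]
    Debug Vals:[3][0.02]
    Debug Vals:[4][0.02]
    Debug Vals:[5][0.04]
    Debug Vals:[6][0.04]
    Debug Vals:[7][0.06]
    Debug Vals:[8][0.06]
    Debug Vals:[9][0.06]

Note: although I expected the changes at the 2nd, 4th and 6th iterations, the learning rate changed at the 3rd, 5th and 7th. I believe this is because self.iterations is incremented after self.lr is adjusted.

However, when I first made the changes, the "# Key lines to change self.lr" I used were:

    self.lr = tf.contrib.eager.py_func(func=lr_stepper, inp=[self.iterations, self.lr], Tout=tf.float32)
    self.updates = [K.update_add(self.iterations, 1)]

When I did this, I saw the following output:

    Debug Vals:[1][0.01]
    Debug Vals:[2][0.02]
    Debug Vals:[3][0.01]
    Debug Vals:[4][0.04]
    Debug Vals:[5][0.01]
    Debug Vals:[6][0.06]
    Debug Vals:[7][0.01]
    Debug Vals:[8][0.01]
    Debug Vals:[9][0.01]

Here I can see the learning rate change at the 2nd, 4th and 6th iterations (which now seems odd), but the changed values are forgotten on the subsequent iterations.

The difference between the two cases is that in one K.update is used to change the value of lr, and in the other it is not. When Keras uses the TensorFlow backend, K.update executes tf.assign.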
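To make the contrast between the two cases concrete, here is a minimal pure-Python sketch. It is not TensorFlow code and only an analogy: the Variable class and the simulate helper are hypothetical names introduced for illustration, and the order in which the value is read and written is an assumption chosen to match what I observed. With use_assign=True the stepped value is written back into the stored variable after the stored value has been read for printing; with use_assign=False the stepped value is only derived from the stored value on each step and never written back. Under those assumptions it reproduces both debug sequences above.

```python
class Variable:
    """Toy stand-in for a stateful variable (hypothetical, not a TensorFlow class)."""

    def __init__(self, value):
        self.value = value

    def assign(self, new_value):
        # Overwrite the stored state so that later steps see the new value.
        self.value = new_value


def lr_stepper(iteration, lr):
    """Return the stepped rate at iterations 2, 4 and 6, otherwise the input rate."""
    if iteration in (2, 4, 6):
        return 0.01 * iteration
    return lr


def simulate(use_assign, steps=9):
    """Replay `steps` training steps and record (iteration, printed learning rate)."""
    iterations = Variable(0)
    lr = Variable(0.01)
    printed = []
    for _ in range(steps):
        iterations.assign(iterations.value + 1)
        stepped = lr_stepper(iterations.value, lr.value)
        if use_assign:
            # Print the stored value, then write the stepped value back.
            printed.append((iterations.value, lr.value))
            lr.assign(stepped)
        else:
            # Print the derived value; the stored value is never modified.
            printed.append((iterations.value, stepped))
    return printed


if __name__ == "__main__":
    for label, flag in (("with assign", True), ("without assign", False)):
        print(label)
        for iteration, rate in simulate(flag):
            print(" Debug Vals:[%d][%g]" % (iteration, rate))
```

In the second branch the stored rate stays at 0.01 for the whole run, which would explain why the changed values look "forgotten" one step later.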

What is tf.assign doing that old_tensor = new_value is not?

Why is tf.assign needed to permanently change the value of a TensorFlow variable?

0 answers: