I am trying to obtain the gradients of a function at two different values of the independent variable. I am doing this by modifying the get_updates method of a Keras optimizer class.
The relevant part of the code I wrote is:
def get_updates(self, loss, params):
    grads = self.get_gradients(loss, params)
    self.updates = [K.update_add(self.iterations, 1)]
    # Copy parameters
    params2 = []
    for p in params:
        params2.append(K.variable(K.get_value(p), name=p.name[:-2] + "_cpy1/"))
    self.weights = [self.iterations]
    for p, p2, g in zip(params, params2, grads):
        v = - self.lr * g
        new_p = p + 0.5 * v  # Intermediate/Partial step
        new_p2 = p2 + v      # Reference point for 2nd Gradient
        grads2 = self.get_gradients(loss, params2)  # <-- this call raises the error
        ....
The error I get when running this code is:
File "/home/me/Projects/RungeKutta/rk_optimizers2.py", line 86, in get_updates
grads2 = self.get_gradients(loss, params2)
File "/home/me/anaconda2/lib/python2.7/site-packages/keras/optimizers.py", line 91, in get_gradients
raise ValueError('An operation has `None` for gradient. '
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
The offending part of Keras's optimizers.py is:
def get_gradients(self, loss, params):
    grads = K.gradients(loss, params)
    if None in grads:
        raise ValueError('An operation has `None` for gradient. '
                         'Please make sure that all of your ops have a '
                         'gradient defined (i.e. are differentiable). '
                         'Common ops without gradient: '
                         'K.argmax, K.round, K.eval.')
In my case, K corresponds to the TensorFlow backend, so the relevant implementation of gradients is:
def gradients(loss, variables):
    """Returns the gradients of `loss` w.r.t. `variables`.

    # Arguments
        loss: Scalar tensor to minimize.
        variables: List of variables.

    # Returns
        A gradients tensor.
    """
    return tf.gradients(loss, variables, colocate_gradients_with_ops=True)
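To check that the problem is not specific to my optimizer subclass, I reduced it to the standalone sketch below (my own minimal example, assuming the TensorFlow 1.x graph-mode backend my setup uses). A fresh copy of a variable gets a None gradient, apparently because it never feeds into the graph that computes loss:

import tensorflow as tf

w = tf.Variable(1.0, name="w")            # variable the loss is actually built from
loss = tf.square(w)

w_copy = tf.Variable(1.0, name="w_copy")  # same value, but never used to compute loss

print(tf.gradients(loss, [w]))       # [<tf.Tensor ...>] -- a real gradient
print(tf.gradients(loss, [w_copy]))  # [None] -- w_copy is disconnected from loss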
Why am I seeing this error? And how can I obtain the gradients at the first projected update of the parameters, so that the final update can use the average of the gradient at the starting point and the gradient at the (initial) projected endpoint?
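For concreteness, this is the update scheme I am ultimately after, written in plain NumPy with my own (hypothetical) names, independent of any Keras machinery:

import numpy as np

def heun_step(p, grad_fn, lr):
    # One update of the intended scheme: average the gradient at the
    # starting point with the gradient at the endpoint of a plain SGD step.
    g1 = grad_fn(p)                  # gradient at the starting point
    p_proj = p - lr * g1             # projected (trial) endpoint
    g2 = grad_fn(p_proj)             # gradient at the projected endpoint
    return p - 0.5 * lr * (g1 + g2)  # final update with the averaged gradient

# e.g. for loss(p) = sum(p**2):
# heun_step(np.array([3.0]), lambda p: 2.0 * p, lr=0.1)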