In TensorFlow, we can define our own op and its gradient like this: https://gist.github.com/harpone/3453185b41d8d985356cbe5e57d67342

However, can we modify any variable in the computational graph inside these Python functions? For example, in the _MySquareGrad function, I assume we can get the variable like this:

    var = tf.get_variable('var')

and then do something to change its value before assigning it back, e.g.:

    tmp = var * 10
    var.assign(tmp)

Thanks!

Also, when we do var * 10, do we have to convert it to numpy?

Background: I'm familiar with automatic differentiation but new to TensorFlow and Python, so please point out any syntax problems and let me know whether my intent is clear.
Answer 0 (score: 2)
You can modify variables in the computational graph inside these Python functions. Your example code with tmp = var * 10 will work, and nothing gets converted to numpy. In fact, you should try to avoid converting to numpy, since it slows down the computation.
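To make that concrete, here is a minimal sketch (the variable name demo_var is just for illustration): an expression like var * 10 stays a symbolic tensor in the graph, and numpy only appears when you explicitly fetch a value with sess.run() or .eval():

    import tensorflow as tf

    var = tf.get_variable("demo_var", shape=[], initializer=tf.constant_initializer(0.2))
    tmp = var * 10.  # still a symbolic tensor in the graph; no numpy conversion here

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(type(tmp))            # a tf.Tensor
        print(type(sess.run(tmp)))  # a numpy scalar -- values only become numpy when fetched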
EDIT:

You can include your code in the gradient calculation graph of the _MySquareGrad function like this:
    def _MySquareGrad(op, grad):
        # first get a Variable that was created using tf.get_variable()
        with tf.variable_scope("", reuse=True):
            var = tf.get_variable('var')
        # now create the assign graph:
        tmp = var * 10.
        assign_op = var.assign(tmp)
        # now make the assign operation part of the grad calculation graph:
        with tf.control_dependencies([assign_op]):
            x = tf.identity(op.inputs[0])
        return grad * 20 * x
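The tf.control_dependencies block is what ties the assign into the gradient: any op created inside it only runs after assign_op has run, so evaluating the gradient forces the assign to execute. A minimal standalone sketch of that pattern (variable names are illustrative):

    import tensorflow as tf

    v = tf.get_variable("v", shape=[], initializer=tf.constant_initializer(1.0))
    assign_op = v.assign(v + 1.)

    # ops created inside this block only run after assign_op has run:
    with tf.control_dependencies([assign_op]):
        out = tf.identity(v)  # reads v *after* the increment

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(out))  # 2.0 -- evaluating `out` triggered the assign first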
Here is a working example:
    import tensorflow as tf
    from tensorflow.python.framework import ops
    import numpy as np

    # Define a custom py_func which also takes a grad op as an argument:
    def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
        # Need to generate a unique name to avoid duplicates:
        rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
        tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
        g = tf.get_default_graph()
        with g.gradient_override_map({"PyFunc": rnd_name}):
            return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

    # Define a custom square function using np.square instead of tf.square:
    def mysquare(x, name=None):
        with ops.name_scope(name, "Mysquare", [x]) as name:
            sqr_x = py_func(np.square,
                            [x],
                            [tf.float32],
                            name=name,
                            grad=_MySquareGrad)  # <-- here's the call to the gradient
            return sqr_x[0]

    ### Actual gradient:
    ##def _MySquareGrad(op, grad):
    ##    x = op.inputs[0]
    ##    return grad * 20 * x  # add a "small" error just to see the difference
    ##    # (the analytic gradient of x**2 would be grad * 2 * x; 20 is deliberate)

    def _MySquareGrad(op, grad):
        # first get a Variable that was created using tf.get_variable()
        with tf.variable_scope("", reuse=True):
            var = tf.get_variable('var')
        # now create the assign graph:
        tmp = var * 10.
        assign_op = var.assign(tmp)
        # now make the assign operation part of the grad calculation graph:
        with tf.control_dependencies([assign_op]):
            x = tf.identity(op.inputs[0])
        return grad * 20 * x

    with tf.Session() as sess:
        x = tf.constant([1., 2.])
        var = tf.get_variable(name="var", shape=[], initializer=tf.constant_initializer(0.2))
        y = mysquare(x)
        tf.global_variables_initializer().run()
        print(x.eval(), y.eval(), tf.gradients(y, x)[0].eval())
        print("Now var is 10 times larger:", var.eval())