There are plenty of py_func examples on Stack Overflow, but I only want to define a gradient for a custom activation function, something like the one below, which uses only native TensorFlow ops. The example uses an identity forward pass.
Suppose I have registered a gradient for my activation "OPLU" (the comments describe my understanding so far):
@tf.RegisterGradient("OPLUGrad")
def oplugrad(op, grad):
    x = op.inputs[0]  # need x!
    # This print should execute if oplugrad is invoked,
    # because it is inserted into the evaluation chain of the output
    x = tf.Print(x, [tf.shape(x)], message='debug: ')
    grad_new = x * grad  # just an example
    return grad_new
and defined my layer:
def tf_oplu(x, name="OPLU"):
    y = ...f(x)...
    # Here a new op is created, as far as I understand
    with ops.op_scope([x], name, "OPLUop") as name:
        g = tf.get_default_graph()
        # As far as I understand, this tells TensorFlow to use
        # "OPLUGrad" whenever the "OPLU" activation is applied
        with g.gradient_override_map({"OPLU": "OPLUGrad"}):
            # OK, gradient assigned; now return what the forward layer computes
            return y
But I don't see any output from the tf.Print in the gradient function, which means it is never executed.
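For comparison, here is my understanding of the minimal documented pattern: a toy sketch (the name "DebugIdGrad" is made up, and this is not my actual OPLU code) where the override key is the *type* of an op created inside the scope:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

@tf.RegisterGradient("DebugIdGrad")
def _debug_id_grad(op, grad):
    x = op.inputs[0]
    # This print fires when the gradient is actually evaluated
    x = tf.Print(x, [tf.shape(x)], message='debug: ')
    return x * grad

x = tf.constant([1.0, 2.0])
g = tf.get_default_graph()
# Key "Identity" is the type of the op created inside the scope;
# tf.identity must be called while the override scope is active.
with g.gradient_override_map({"Identity": "DebugIdGrad"}):
    y = tf.identity(x)

dy = tf.gradients(y, x)[0]  # incoming grad is ones, so dy == x here
with tf.Session() as sess:
    print(sess.run(dy))
```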
Question 1: How do I register and use these two functions correctly, so that built-in optimizers such as AdamOptimizer still work?
Question 2: As far as I understand, the standard gradient computation is suppressed this way. What if I want the standard gradient to be computed first, and then to apply some modification to it, without interfering with the Session() code, i.e., fetch and modify the gradients by hand during the Session() run (I think I have seen this somewhere)?
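To be concrete about the kind of manual modification I mean, here is a toy sketch (not my real code; the 0.5 scaling is arbitrary) using the compute/apply split that tf.train optimizers expose:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

x = tf.Variable(3.0)
loss = tf.square(x)

opt = tf.train.AdamOptimizer(learning_rate=0.1)
# Standard, unsuppressed gradients as (gradient, variable) pairs
grads_and_vars = opt.compute_gradients(loss)
# Hand-modify before applying, e.g. scale each gradient by 0.5
modified = [(0.5 * g, v) for g, v in grads_and_vars]
train_op = opt.apply_gradients(modified)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    g_val = sess.run(grads_and_vars[0][0])  # d(x^2)/dx at x=3 -> 6.0
    sess.run(train_op)
```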
Edit: Here is the example of code for which I want to replace tf.nn.relu with my tf_OPLU
Thanks!