TensorFlow: modifying a variable inside py_func (and its grad func)

Date: 2017-04-02 15:42:59

Tags: tensorflow

In TensorFlow we can define our own ops and their gradients as shown here: https://gist.github.com/harpone/3453185b41d8d985356cbe5e57d67342

However, can we modify any variable in the computational graph from within these Python functions? For example, in the "_MySquareGrad" function?

I assume we can get the variable with:

var = tf.get_variable('var')

and then do something to change its value and assign it back? E.g.:

tmp = var*10
var.assign(tmp)

Thanks!

Also, when we compute var * 10, do we have to convert it to numpy?

Background: I'm familiar with automatic differentiation but new to TensorFlow and Python, so please point out any syntax problems and let me know whether my intent is clear.

1 Answer:

Answer 0 (score: 2)

Yes, you can modify variables in the computational graph from within these Python functions. Your example code with tmp = var*10 will work and does not convert anything to numpy.

In fact, you should try to avoid converting to numpy, since it slows down the computation.

Edit:

You can include your code in the gradient calculation graph of the _MySquareGrad function:

def _MySquareGrad(op, grad):

  #first get a Variable that was created using tf.get_variable()
  with tf.variable_scope("", reuse=True):
    var  = tf.get_variable('var')

  #now create the assign graph:
  tmp = var*10.
  assign_op = var.assign(tmp)

  #now make the assign operation part of the grad calculation graph:
  with tf.control_dependencies([assign_op]):
    x = tf.identity(op.inputs[0])

  return grad * 20 * x
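To make the arithmetic in this gradient function concrete, here is a pure-NumPy sketch (no TensorFlow needed, and not part of the original answer): the forward pass squares the input, the returned gradient is grad * 20 * x (deliberately not the true 2 * x, to make the custom gradient visible), and the assign multiplies var by 10 as a side effect.

```python
import numpy as np

x = np.array([1., 2.], dtype=np.float32)
var = np.float32(0.2)

y = np.square(x)             # forward pass: py_func runs np.square
upstream = np.ones_like(x)   # incoming gradient (all ones for tf.gradients)
dx = upstream * 20. * x      # what _MySquareGrad returns
var = var * np.float32(10.)  # side effect of assign_op

print(y)    # [1. 4.]
print(dx)   # [20. 40.]
print(var)  # 2.0
```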

Here's a working example:

import tensorflow as tf
from tensorflow.python.framework import ops
import numpy as np

# Define custom py_func which takes also a grad op as argument:
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):

    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))

    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

# Def custom square function using np.square instead of tf.square:
def mysquare(x, name=None):

    with ops.name_scope(name, "Mysquare", [x]) as name:
        sqr_x = py_func(np.square,
                        [x],
                        [tf.float32],
                        name=name,
                        grad=_MySquareGrad)  # <-- here's the call to the gradient
        return sqr_x[0]

### Actual gradient:
##def _MySquareGrad(op, grad):
    ##x = op.inputs[0]
    ##return grad * 20 * x  # add a "small" error just to see the difference:


def _MySquareGrad(op, grad):

  #first get a Variable that was created using tf.get_variable()
  with tf.variable_scope("", reuse=True):
    var  = tf.get_variable('var')

  #now create the assign graph:
  tmp = var*10.
  assign_op = var.assign(tmp)

  #now make the assign operation part of the grad calculation graph:
  with tf.control_dependencies([assign_op]):
    x = tf.identity(op.inputs[0])

  return grad * 20 * x



with tf.Session() as sess:
    x = tf.constant([1., 2.])

    var = tf.get_variable(name="var", shape=[], initializer=tf.constant_initializer(0.2))

    y = mysquare(x)
    tf.global_variables_initializer().run()

    print(x.eval(), y.eval(), tf.gradients(y, x)[0].eval())
    print("Now var is 10 times larger:", var.eval())
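One caveat worth noting (an observation on the code above, not part of the original answer): because assign_op sits inside the gradient graph via tf.control_dependencies, var is multiplied by 10 on every evaluation of the gradient, not just once. A quick NumPy sketch of the running value:

```python
import numpy as np

var = np.float32(0.2)
# each .eval() / sess.run() of tf.gradients(y, x)[0] re-runs assign_op:
for _ in range(3):
    var = var * np.float32(10.)
print(var)  # 200.0 after three gradient evaluations
```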