Question

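I want to assemble a new op from a subgraph (made up of several connected op nodes) and then apply a gradient of my own design to the new op. (The point is to ignore the gradient flow inside the subgraph and bridge the gradient directly from the new op's output tensor to its input tensor.) Hope someone can help!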

Answer 1

You can wrap the subgraph in a TensorFlow function and specify a custom gradient for that function, as in this example:

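    # TensorFlow 1.x internal modules that provide Defun and the ops used below.
    from tensorflow.python.framework import dtypes, function
    from tensorflow.python.ops import array_ops, math_ops, nn_ops

    dtype = dtypes.float32  # assumed here; the snippet leaves dtype undefined

    # Custom gradient: takes the op's inputs plus the gradient of its output.
    @function.Defun(dtype, dtype, dtype)
    def XentLossGrad(logits, labels, dloss):
      dlogits = array_ops.reshape(dloss, [-1, 1]) * (
          nn_ops.softmax(logits) - labels)
      dlabels = array_ops.zeros_like(labels)
      # Takes exp(dlogits) to differentiate it from the "correct" gradient.
      return math_ops.exp(dlogits), dlabels

    # Forward function; grad_func replaces the gradient of the wrapped subgraph.
    @function.Defun(dtype, dtype, grad_func=XentLossGrad)
    def XentLoss(logits, labels):
      return math_ops.reduce_sum(labels * math_ops.log(nn_ops.softmax(logits)),
                                 1)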
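The same idea can also be sketched with the public `tf.custom_gradient` decorator. This is only a minimal illustration, assuming TensorFlow 2.x with eager execution; the op name `fused_op`, the toy subgraph (square, then sum), and the pass-through gradient rule are made up for the example and are not part of the answer above.

```python
import tensorflow as tf


@tf.custom_gradient
def fused_op(x):
    # Forward pass: a small subgraph (square, then sum over the last axis)
    # treated as a single op. fused_op is a hypothetical name.
    y = tf.reduce_sum(tf.square(x), axis=-1)

    def grad(dy):
        # Hand-designed backward rule: bridge the output gradient dy straight
        # to the input, ignoring the subgraph's true derivative (2 * x).
        # Here dy is simply broadcast to every input element.
        return tf.expand_dims(dy, -1) * tf.ones_like(x)

    return y, grad


x = tf.constant([[1.0, 2.0, 3.0]])
with tf.GradientTape() as tape:
    tape.watch(x)  # x is a constant, so it must be watched explicitly
    loss = tf.reduce_sum(fused_op(x))

# dx comes from the custom rule (all ones), not from the true gradient [[2, 4, 6]].
dx = tape.gradient(loss, x)
```

Whatever ops run inside `fused_op` are invisible to backpropagation here: only `grad` decides what flows from the output tensor back to the input tensor.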
