Question

I created a theano.Op that returns the distance between every pair of rows taken from two input sets, wrapping scipy's cdist:

import theano
from theano import gof
from scipy.spatial import distance

class Cdist(theano.Op):

    __props__ = ()

    def __init__(self):
        #self.fn = scipy_cdist2
        super(Cdist, self).__init__()

    def make_node(self, x, w):
        #print('make_node')
        return gof.Apply(self, [x, w], [x.type()])

    def perform(self, node, inputs, output_storage):
        #print('perform')
        x, w = inputs[0], inputs[1]
        z = output_storage[0]
        z[0] = distance.cdist(x, w, 'euclidean')

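It works, but now I want to add a grad method. I have read the guide and the documentation on the grad method, but I still don't understand how it works. For example, in the guide, to get the gradient of a method that returns a*x + b, they use: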

def grad(self, inputs, output_grads):
    return [a * output_grads[0] + b]

Why? Here is what the documentation says about grad:

If the outputs list of the op is [f_1, ..., f_n], then the list output_gradients is [grad_{f_1}(C), grad_{f_2}(C), ..., grad_{f_n}(C)]. If inputs consists of the list [x_1, ..., x_m], then Op.grad should return the list [grad_{x_1}(C), grad_{x_2}(C), ..., grad_{x_m}(C)], where (grad_{y}(Z))_i = \frac{\partial Z}{\partial y_i} (and i can stand for multiple dimensions).

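So they are telling me I have to write the gradient myself? But in the example, output_grads is combined with plain integer values. I really don't understand.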

Answer 1

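The documentation is not wrong. In the grad method you write symbolic expressions, as opposed to the perform method, where you write numerical expressions.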

The grad method is called from theano.grad, whereas perform is called inside the compiled function.

For example, assuming Euclidean distance:

def grad(self, inputs, out_grads):
    # assumes `import theano.tensor as T` at module level
    x, y = inputs   # matrices of shape [mA, n] and [mB, n]
    g, = out_grads   # matrix of shape [mA, mB]
    diff = x.dimshuffle(0, 'x', 1) - y.dimshuffle('x', 0, 1)   # [mA, mB, n] tensor
    z = T.sqrt(T.sum(T.sqr(diff), axis=2, keepdims=True))   # [mA, mB, 1] distances
    diff = g.dimshuffle(0, 1, 'x') * diff / z   # broadcast g over the last axis
    return [T.sum(diff, axis=1), -T.sum(diff, axis=0)]
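
For reference, the two returned expressions follow from the chain rule. The notation here is introduced only for illustration: z_{ij} is the distance between row i of x and row j of y, and g_{ij} is the corresponding entry of out_grads[0].

```latex
z_{ij} = \lVert x_i - y_j \rVert_2, \qquad g_{ij} = \frac{\partial C}{\partial z_{ij}}

\frac{\partial C}{\partial x_i} = \sum_{j} g_{ij}\,\frac{x_i - y_j}{z_{ij}}, \qquad
\frac{\partial C}{\partial y_j} = -\sum_{i} g_{ij}\,\frac{x_i - y_j}{z_{ij}}
```

The sum over j corresponds to T.sum(diff, axis=1) and the sum over i to T.sum(diff, axis=0). Both expressions are undefined wherever z_{ij} = 0, that is, when two rows coincide.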

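For this particular case, I would suggest writing an L_op instead of grad. L_op additionally receives the outputs of the forward Op, so they can be reused.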

def L_op(self, inputs, outputs, out_grads):
    x, y = inputs   # matrices of shape [mA, n] and [mB, n]
    z, = outputs   # matrix of shape [mA, mB]
    g, = out_grads   # idem
    diff = x.dimshuffle(0, 'x', 1) - y.dimshuffle('x', 0, 1)   # [mA, mB, n] tensor
    diff = g.dimshuffle(0, 1, 'x') * diff / z.dimshuffle(0, 1, 'x')
    return [T.sum(diff, axis=1), -T.sum(diff, axis=0)]

Well, the gradient expressions may not be exactly right, but you get the idea.
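
One way to gain confidence in such expressions without compiling anything is to compare them against central finite differences. The sketch below is not Theano code: it mirrors the same formula in plain NumPy, and the helper names (cdist_euclidean, cdist_grad, numeric_grad) are made up for this illustration.

```python
import numpy as np

def cdist_euclidean(x, y):
    """Pairwise Euclidean distances, shape [mA, mB]."""
    diff = x[:, None, :] - y[None, :, :]            # [mA, mB, n]
    return np.sqrt((diff ** 2).sum(axis=2))

def cdist_grad(x, y, g):
    """NumPy mirror of the symbolic expressions: gradients w.r.t. x and y."""
    diff = x[:, None, :] - y[None, :, :]            # [mA, mB, n]
    z = np.sqrt((diff ** 2).sum(axis=2, keepdims=True))   # [mA, mB, 1]
    weighted = g[:, :, None] * diff / z
    return weighted.sum(axis=1), -weighted.sum(axis=0)

def numeric_grad(cost, arr, eps=1e-6):
    """Central finite differences of a scalar cost w.r.t. each entry of arr."""
    grad = np.zeros_like(arr)
    for idx in np.ndindex(*arr.shape):
        orig = arr[idx]
        arr[idx] = orig + eps
        up = cost()
        arr[idx] = orig - eps
        down = cost()
        arr[idx] = orig                              # restore the entry
        grad[idx] = (up - down) / (2 * eps)
    return grad

rng = np.random.RandomState(0)
x = rng.randn(4, 3)
y = rng.randn(5, 3)
g = rng.randn(4, 5)   # plays the role of out_grads[0]

# Scalar cost whose gradient w.r.t. the distances is exactly g.
cost = lambda: (g * cdist_euclidean(x, y)).sum()

gx, gy = cdist_grad(x, y, g)
print(np.allclose(gx, numeric_grad(cost, x), atol=1e-5))
print(np.allclose(gy, numeric_grad(cost, y), atol=1e-5))
```

The check uses random inputs, so no two rows coincide and the division by z is safe here.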

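As you can see, we are calling symbolic functions such as dimshuffle. However, there are cases where you want to write a dedicated class for the grad Op, either because the symbolic graph is too inefficient or because you want a custom gradient.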

For example:

class CDistGrad(theano.Op):
    def __init__(self):
        # <...>
        pass
    def c_code(self, node, name, inputs, outputs, sub):
        # implement this in case you want more performance
        pass
    def perform(self, node, inputs, output_storage):
        # <...>
        pass
    def make_node(self, x, y, g):
        # <...>
        pass

class CDist(theano.Op):
    # <...>
    def grad(self, inputs, output_grads):
        # CDistGrad must produce one gradient per input (here: x and y)
        return CDistGrad()(*inputs, *output_grads)

Still, symbolic expressions are what the grad method uses; a custom Op simply replaces the vanilla Theano expressions.
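
To make the numeric side concrete, here is a rough sketch of what the body of CDistGrad.perform could compute. It is written as a standalone NumPy function (the name cdist_grad_perform is hypothetical) that follows the output_storage convention used by perform. Returning 0 where two points coincide is an assumption of this sketch, not something Theano or scipy prescribes.

```python
import numpy as np

def cdist_grad_perform(inputs, output_storage):
    """Hypothetical body for CDistGrad.perform: numeric, not symbolic."""
    x, y, g = inputs                                 # [mA, n], [mB, n], [mA, mB]
    diff = x[:, None, :] - y[None, :, :]             # [mA, mB, n]
    z = np.sqrt((diff ** 2).sum(axis=2))             # [mA, mB]
    # Assumption: where two points coincide the distance is 0 and the
    # gradient is undefined; this sketch writes 0 there instead of NaN.
    scale = np.divide(g, z, out=np.zeros_like(z), where=z > 0)
    weighted = scale[:, :, None] * diff
    output_storage[0][0] = weighted.sum(axis=1)      # gradient w.r.t. x, [mA, n]
    output_storage[1][0] = -weighted.sum(axis=0)     # gradient w.r.t. y, [mB, n]

x = np.array([[0.0, 0.0], [3.0, 4.0]])
y = np.array([[0.0, 0.0]])
g = np.ones((2, 1))
storage = [[None], [None]]   # one single-element cell per output
cdist_grad_perform([x, y, g], storage)
print(storage[0][0])
print(storage[1][0])
```

In an actual Op, make_node would also have to declare two outputs so that output_storage has two cells.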

How do I add a grad method to a theano Op?
