Finding the implementation of a method in TensorFlow

Date: 2018-03-06 15:15:08

Tags: python optimization tensorflow gradient-descent

I want to modify one of the minimization optimizers used in TensorFlow, such as AdadeltaOptimizer. The license allows this, but there is no code in the installed lib, only references, so how can I find the implementation? Here is the Adadelta entry from the API:

@tf_export("train.AdadeltaOptimizer") class 
AdadeltaOptimizer(optimizer.Optimizer) 
Optimizer that implements the Adadelta algorithm.
See [M. D. Zeiler](http://arxiv.org/abs/1212.5701) ([pdf]
(http://arxiv.org/pdf/1212.5701v1.pdf))

1 Answer:

Answer (score: 1):

The first entry point is python/training/adadelta.py in the main TensorFlow repo. But as you may notice, it is only a Python wrapper: all of the ops are actually implemented in native C++ and loaded into Python (this is common practice in TensorFlow; see, for example, this question: Where is the code for gradient descent?).
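
If you just want to locate that wrapper in your local installation, the standard inspect module can help. A minimal sketch, assuming a TensorFlow 1.x install where tf.train.AdadeltaOptimizer is importable:

import inspect
import tensorflow as tf

# Path of the .py file that defines the wrapper class,
# typically .../tensorflow/python/training/adadelta.py
print(inspect.getsourcefile(tf.train.AdadeltaOptimizer))

# The wrapper's source code itself
print(inspect.getsource(tf.train.AdadeltaOptimizer))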

For example, you can find the CPU implementation of the ApplyAdadelta op in core/kernels/training_ops.cc. The GPU implementation of the same op is in core/kernels/training_ops_gpu.cu.cc:

template <typename T>
struct ApplyAdadelta<GPUDevice, T> {
  void operator()(const GPUDevice& d, typename TTypes<T>::Flat var,
                  typename TTypes<T>::Flat accum,
                  typename TTypes<T>::Flat accum_update,
                  typename TTypes<T>::ConstScalar lr,
                  typename TTypes<T>::ConstScalar rho,
                  typename TTypes<T>::ConstScalar epsilon,
                  typename TTypes<T>::ConstFlat grad) {
    Eigen::array<typename TTypes<T>::Tensor::Index, 1> bcast;
    bcast[0] = grad.dimension(0);
    Eigen::Sizes<1> single;

    accum.device(d) = accum * rho.reshape(single).broadcast(bcast) +
                      grad.square() * (grad.constant(T(1)) -
                                       rho.reshape(single).broadcast(bcast));
    const auto update =
        (accum_update + epsilon.reshape(single).broadcast(bcast)).sqrt() *
        (accum + epsilon.reshape(single).broadcast(bcast)).rsqrt() * grad;
    var.device(d) -= update * lr.reshape(single).broadcast(bcast);
    accum_update.device(d) =
        accum_update * rho.reshape(single).broadcast(bcast) +
        update.square() *
            (grad.constant(T(1)) - rho.reshape(single).broadcast(bcast));
  }
};
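
For orientation, here is a minimal NumPy sketch of the update rule this kernel computes (the function name and signature are my own, chosen to mirror the kernel's arguments; this is not TensorFlow code):

import numpy as np

def apply_adadelta(var, accum, accum_update, lr, rho, epsilon, grad):
    # Decaying average of squared gradients.
    accum[:] = rho * accum + (1.0 - rho) * np.square(grad)
    # Scale the gradient by the RMS of past updates over the RMS of gradients.
    update = np.sqrt(accum_update + epsilon) / np.sqrt(accum + epsilon) * grad
    # Apply the learning-rate-scaled update in place.
    var -= lr * update
    # Decaying average of squared updates.
    accum_update[:] = rho * accum_update + (1.0 - rho) * np.square(update)

This mirrors the four statements in the C++ functor above; the broadcasting of the scalar hyperparameters is handled implicitly by NumPy.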

If you are going to patch the C++ code, you will have to rebuild the .so library. To be able to run the new optimizer on both CPU and GPU, you have to modify and rebuild both kernel implementations.