Question

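I am trying to find the source code for TensorFlow's low-level linear algebra and matrix arithmetic operators as they execute on the CPU. For example, where is the actual implementation of tf.add() that runs on the CPU? As far as I know, most linear algebra operators are implemented by Eigen, but I would like to know which specific Eigen functions are called.

I tried tracing down from the high-level API, but this is difficult because there are many steps between placing an operator on the graph and the TF runtime actually executing it.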

Answer 1

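The implementation is hidden behind some template metaprogramming (not unusual for Eigen).

Every operation in TensorFlow is registered at some point. Add is registered here and here.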

REGISTER3(BinaryOp, GPU, "Add", functor::add, float, Eigen::half, double);

The actual implementation of an operation is based on OpKernel. The Add operation is implemented in BinaryOp::Compute. The class hierarchy is BinaryOp : BinaryOpShared : OpKernel.
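The hierarchy described above can be pictured with a stripped-down sketch. This is not TensorFlow's code: the classes below only borrow the names OpKernel, BinaryOpShared and BinaryOp, the Compute signature is a simplification that works on std::vector<float> instead of an OpKernelContext, and RunAdd is a hypothetical helper added for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Simplified stand-in for tensorflow::OpKernel: the runtime only sees this interface.
class OpKernel {
 public:
  virtual ~OpKernel() = default;
  virtual void Compute(const std::vector<float>& in0,
                       const std::vector<float>& in1,
                       std::vector<float>* out) = 0;
};

// Stand-in for BinaryOpShared: non-templated code shared by all binary ops.
class BinaryOpShared : public OpKernel {
 protected:
  static bool SameShape(const std::vector<float>& a,
                        const std::vector<float>& b) {
    return a.size() == b.size();
  }
};

// Stand-in for BinaryOp: the templated part that knows which functor to apply.
template <typename Functor>
class BinaryOp : public BinaryOpShared {
 public:
  void Compute(const std::vector<float>& in0, const std::vector<float>& in1,
               std::vector<float>* out) override {
    out->clear();
    if (!SameShape(in0, in1)) return;  // a real kernel would broadcast or report an error
    out->reserve(in0.size());
    for (std::size_t i = 0; i < in0.size(); ++i) {
      out->push_back(Functor()(in0[i], in1[i]));
    }
  }
};

// Hypothetical helper: run "Add" through the base-class interface, as a runtime would.
inline std::vector<float> RunAdd(const std::vector<float>& a,
                                 const std::vector<float>& b) {
  BinaryOp<std::plus<float>> kernel;
  OpKernel& base = kernel;
  std::vector<float> out;
  base.Compute(a, b, &out);
  return out;
}
```

The point of the split is that the runtime can hold any kernel as an OpKernel, while the operation-specific work is chosen at compile time through the template argument.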

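In the case of adding two scalars (more generally, whenever the second operand holds a single element), the entire implementation is just: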

functor::BinaryFunctor<Device, Functor, 1>().Right(
            eigen_device, out_flat, in0.template flat<Tin>(),
            in1.template scalar<Tin>(), error_ptr);

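where in0 and in1 are the incoming tensors (scalars in this case), Device is either GPU or CPU, and Functor is the operation itself. The other lines only perform broadcasting.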
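To make the roles of Device, Functor and Right more concrete, here is a minimal sketch of the same pattern without Eigen. Everything in it is an assumption made for illustration: CPUDevice is an empty tag type, functor::add is a plain callable rather than TensorFlow's wrapper around an Eigen functor, tensors are std::vector<float>, and AddScalar is a hypothetical helper.

```cpp
#include <cstddef>
#include <vector>

struct CPUDevice {};  // hypothetical device tag; TensorFlow uses Eigen device types here

namespace functor {

// Simplified operation functor (TensorFlow's functor::add wraps an Eigen functor instead).
struct add {
  float operator()(float a, float b) const { return a + b; }
};

template <typename Device, typename Functor>
struct BinaryFunctor {
  // Scalar on the right-hand side: out[i] = in[i] OP scalar.
  void Right(const Device&, std::vector<float>* out,
             const std::vector<float>& in, float scalar) const {
    out->resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
      (*out)[i] = Functor()(in[i], scalar);
    }
  }

  // Scalar on the left-hand side: out[i] = scalar OP in[i].
  void Left(const Device&, std::vector<float>* out, float scalar,
            const std::vector<float>& in) const {
    out->resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
      (*out)[i] = Functor()(scalar, in[i]);
    }
  }
};

}  // namespace functor

// Hypothetical helper mirroring the call quoted above: tensor in0 plus scalar in1.
inline std::vector<float> AddScalar(const std::vector<float>& in0, float in1) {
  std::vector<float> out;
  functor::BinaryFunctor<CPUDevice, functor::add>().Right(CPUDevice(), &out, in0, in1);
  return out;
}
```

Left and Right only matter for non-commutative operations such as subtraction; for add they give the same result.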

Scrolling down in this file and expanding the REGISTER3 macro shows how the arguments are passed from REGISTER3 to functor::BinaryFunctor<Device, Functor, ...>.
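As a rough illustration of that expansion, the sketch below re-creates the macro chain in a toy form. It assumes that REGISTER3 forwards to one single-type REGISTER per data type and that each of those instantiates OP<Device, F<T>>; instead of building a kernel, the toy REGISTER only records a string describing what would be instantiated. The string format and the ExpandRegisterAdd helper are made up for this example.

```cpp
#include <string>
#include <vector>

// Toy single-type registration: record which kernel type would be instantiated.
// The arguments are only stringified, so the named types need not exist here.
#define REGISTER(OP, D, N, F, T)                        \
  registry.push_back(std::string(N) + "/" #D "/" #T     \
                     " -> " #OP "<" #D "Device, " #F "<" #T ">>");

// Multi-type variants simply repeat the single-type registration.
#define REGISTER2(OP, D, N, F, T0, T1) \
  REGISTER(OP, D, N, F, T0)            \
  REGISTER(OP, D, N, F, T1)

#define REGISTER3(OP, D, N, F, T0, T1, T2) \
  REGISTER2(OP, D, N, F, T0, T1)           \
  REGISTER(OP, D, N, F, T2)

// Hypothetical helper: expand the registration line quoted in the answer.
inline std::vector<std::string> ExpandRegisterAdd() {
  std::vector<std::string> registry;
  REGISTER3(BinaryOp, GPU, "Add", functor::add, float, Eigen::half, double)
  return registry;
}
```

Each of the three recorded entries pairs the op name and device with one data type, which is how a single REGISTER3 line ends up selecting three different template instantiations.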

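You cannot expect to see explicit loops, because Eigen uses expressions to implement lazy evaluation and to deal with aliasing. The Eigen "call" is here: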

https://github.com/tensorflow/tensorflow/blob/7a0def60d45c1841a4e79a0ddf6aa9d50bf551ac/tensorflow/core/kernels/cwise_ops.h#L693-L696
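The reason no loop is visible at the call site can be shown with a toy expression template. This is only a sketch of the general technique, not Eigen's implementation: the sketch namespace, Vec, Sum and AddThree are all invented for this example, and Eigen's real expression classes are far more elaborate.

```cpp
#include <cstddef>
#include <vector>

namespace sketch {

// Expression node for lhs + rhs. It only stores references; nothing is computed yet.
template <typename L, typename R>
struct Sum {
  const L& lhs;
  const R& rhs;
  float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
  std::size_t size() const { return lhs.size(); }
};

struct Vec {
  std::vector<float> data;
  explicit Vec(std::size_t n, float value = 0.0f) : data(n, value) {}
  float operator[](std::size_t i) const { return data[i]; }
  std::size_t size() const { return data.size(); }

  // Assigning an expression is the only place where a loop actually runs.
  template <typename Expr>
  Vec& operator=(const Expr& expr) {
    for (std::size_t i = 0; i < data.size(); ++i) data[i] = expr[i];
    return *this;
  }
};

// Building an expression is cheap: it returns a node, not a result.
template <typename L, typename R>
Sum<L, R> operator+(const L& lhs, const R& rhs) {
  return Sum<L, R>{lhs, rhs};
}

// Hypothetical helper: a + b + c is evaluated in one pass, with no temporary Vec.
inline Vec AddThree(const Vec& a, const Vec& b, const Vec& c) {
  Vec out(a.size());
  out = a + b + c;
  return out;
}

}  // namespace sketch
```

The expression a + b + c has the type Sum<Sum<Vec, Vec>, Vec>, so the arithmetic is encoded in the type and only executed element by element during assignment. This is why searching the TensorFlow kernel for a hand-written addition loop turns up nothing.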

TensorFlow operator source code
