Question

I'm trying to convert a TensorFlow graph to Core ML and am following this tutorial. There is some code in it that I don't quite understand:

#include <metal_stdlib>
using namespace metal;

kernel void swish(
  texture2d_array<half, access::read> inTexture [[texture(0)]],
  texture2d_array<half, access::write> outTexture [[texture(1)]],
  ushort3 gid [[thread_position_in_grid]])
{
  if (gid.x >= outTexture.get_width() || 
      gid.y >= outTexture.get_height()) {
    return;
  }

  const float4 x = float4(inTexture.read(gid.xy, gid.z));
  const float4 y = x / (1.0f + exp(-x));             
  outTexture.write(half4(y), gid.xy, gid.z);
}

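What I don't understand is how gid is used here. Isn't the grid two-dimensional? What does gid.z represent? And isn't gid.x the x coordinate of the current pixel?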

Answer 1

gid.x and gid.y are the x/y coordinates of the current pixel. So when you call texture.read(gid.xy), it gives you the pixel data for 4 channels.

But the "images" used in neural networks can have more than 4 channels. That is why the texture's data type is texture2d_array<> rather than texture2d<>.

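The gid.z value is the index of the texture "slice" within this array. If the image/tensor has 32 channels, there are 8 texture slices (because each slice stores at most 4 channels of data).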
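To make the slice arithmetic concrete, here is a minimal C++ sketch (plain CPU code, not Metal). The helper names sliceCount and locateChannel are hypothetical, and it assumes channels are packed in order, four per slice, for a single image. With more than one image in a batch the slice index would also have to account for the image number.

```cpp
// Hypothetical helpers illustrating how feature channels map onto
// the slices of a texture2d_array. Assumes in-order packing, 4 channels
// per slice, and a single image (no batching).

// Number of 4-channel slices needed to hold `channels` feature channels.
// Rounds up, so 5 channels still need 2 slices.
constexpr int sliceCount(int channels) {
  return (channels + 3) / 4;
}

// Where a given channel lives: which slice (what gid.z selects in the
// kernel) and which component of the half4 (0 = x, 1 = y, 2 = z, 3 = w).
struct ChannelLocation {
  int slice;
  int component;
};

constexpr ChannelLocation locateChannel(int channel) {
  return ChannelLocation{channel / 4, channel % 4};
}

// The example from the answer: 32 channels need 8 slices.
static_assert(sliceCount(32) == 8, "32 channels fit in 8 slices");
```

Under these assumptions, channel 13 would sit in slice 3, component 1, so the thread with gid.z == 3 is the one that processes it.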

So the grid is actually three-dimensional: (x, y, slice).
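As a rough illustration of that three-dimensional grid, the C++ sketch below computes how many threadgroups would be needed to cover a (width, height, slices) grid. The Size3 type, the function names, and the 16x16x1 threadgroup size are assumptions made for this example and are not taken from the tutorial.

```cpp
// Hypothetical sketch of sizing a compute dispatch for a texture array.
// Size3 loosely mirrors the (width, height, depth) triple a GPU dispatch uses.
struct Size3 {
  int width;
  int height;
  int depth;
};

// Integer division that rounds up.
constexpr int ceilDiv(int a, int b) {
  return (a + b - 1) / b;
}

// Number of threadgroups along each axis so that every (x, y, slice)
// position in `grid` is covered by at least one thread.
constexpr Size3 threadgroupCount(Size3 grid, Size3 threadgroup) {
  return Size3{ceilDiv(grid.width, threadgroup.width),
               ceilDiv(grid.height, threadgroup.height),
               ceilDiv(grid.depth, threadgroup.depth)};
}
```

For a 100x100 image with 32 channels the grid is 100x100x8. With an assumed 16x16x1 threadgroup that works out to 7x7x8 threadgroups, which launches 112x112 threads per slice. The extra threads fall outside the texture, which is why the kernel returns early when gid.x or gid.y is out of range.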
