Question

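I'm writing a custom TensorFlow op for the GPU, and as part of the forward computation's output I need a random tensor of shape (N, H, W).

I tried writing a functor modeled on the implementation of the zeros function that I found: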

template <typename Device, typename T>
struct TensorRandom {
  void operator()(const Device& d, typename TTypes<T>::Flat t) {
    t.device(d) = t.random();  // this is my only change to the TensorZeros function
  }
};

In my op's implementation (the kernel), I have code along these lines:

TensorShape my_shape({N, H, W});
Tensor* random_mat;
OP_REQUIRES_OK(ctx, ctx->allocate_output("random_mat", my_shape, &random_mat));
const Device& device = ctx->eigen_device<Device>();  // this will be a GPU
// random_mat is a pointer, so its members are accessed with ->
functor::TensorRandom<Device, T>()(device, random_mat->flat<T>());

VLOG(1) << "Random Mat " << random_mat->shape().DebugString()
        << random_mat->SummarizeValue(N * H * W);

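When I compile and run this, I get the odd pattern shown below. If the tensor is flattened, every element at a flat index i with i % 4 == 0 or i % 4 == 1 repeats the same value (0.93335256 and 0.53328224 in this output), while the other numbers seem random enough.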

Random Mat: (2,4,2) [[[0.93335256, 0.53328224],
        [0.18036943, 0.12565934],
        [0.93335256, 0.53328224],
        [0.042617  , 0.61869474]],

       [[0.93335256, 0.53328224],
        [0.70387461, 0.88239244],
        [0.93335256, 0.53328224],
        [0.76217792, 0.65087953]]]
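
The pattern can be checked mechanically against the log. The sketch below is plain C++ with no TensorFlow dependency: it copies the 16 logged values in flattened (row-major) order and lists the flat indices whose value equals the value at the same position in the first group of four. The names `kLogged` and `RepeatedIndices` are illustrative only and are not part of the op.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Values copied from the logged (2, 4, 2) tensor, in flattened row-major order.
const std::vector<float> kLogged = {
    0.93335256f, 0.53328224f, 0.18036943f, 0.12565934f,
    0.93335256f, 0.53328224f, 0.042617f,   0.61869474f,
    0.93335256f, 0.53328224f, 0.70387461f, 0.88239244f,
    0.93335256f, 0.53328224f, 0.76217792f, 0.65087953f};

// Illustrative helper: returns every flat index i >= period whose value is
// identical to the value at index i % period.
std::vector<std::size_t> RepeatedIndices(const std::vector<float>& v,
                                         std::size_t period) {
  assert(period > 0 && period <= v.size());
  std::vector<std::size_t> out;
  for (std::size_t i = period; i < v.size(); ++i) {
    if (v[i] == v[i % period]) out.push_back(i);
  }
  return out;
}
```

Calling `RepeatedIndices(kLogged, 4)` on this data yields the indices 4, 5, 8, 9, 12 and 13, which are exactly the positions with i % 4 == 0 or i % 4 == 1 after the first group.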

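I'm using single-precision floats. I observe this on both an NVIDIA V100 and a Quadro M2200, and with every tensor shape I have tried. I can't come up with any hypothesis that explains why this happens.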

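Can anyone offer a theory, or even a solution? Alternatively, I would accept a different way to initialize Tensor* random_mat with valid random numbers. I'm not very familiar with the TensorFlow codebase, so this is the only approach I could think of.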
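
For reference, one fallback that avoids the Eigen device expression altogether is to draw the values on the host from a single, sequentially advanced generator and then copy the buffer into the output tensor. The following is only a minimal sketch in plain C++, assuming a host-side fill is acceptable for the op: `FillUniformHost` is a hypothetical helper, not a TensorFlow API, and the host-to-device copy is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper (not part of TensorFlow): writes `count` uniform floats,
// nominally in [0, 1), into `data` using one generator that advances
// sequentially, so no two elements share generator state.
void FillUniformHost(float* data, std::size_t count, std::uint32_t seed) {
  assert(data != nullptr || count == 0);
  std::mt19937 gen(seed);
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  for (std::size_t i = 0; i < count; ++i) {
    data[i] = dist(gen);
  }
}
```

The trade-off is an extra host-to-device transfer of N * H * W floats per invocation, which may or may not matter depending on the tensor size.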

How to generate a random tensor in a custom TensorFlow op kernel

0 answers: