Question

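TensorFlow's tf.nn.pool takes the average or the maximum pixel in each neighborhood/window.

I want to implement stochastic pooling, i.e. take a random pixel in each window.

Applying a 3x3 kernel/window/neighborhood size on this input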

#include <fstream>
#include <iostream>
using namespace std;

int main() {
    ifstream file("/path/to/your/file/yourfile.txt");  // absolute path from the root directory (on Linux)
    const int size = 9;  // fixed grid size; could be made dynamic
    char array[size][size];
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < size; j++) {
            array[i][j] = '0';
        }
    }
    int x, y;
    char padding;
    // Each record reads '[', x, ',', y, ']'; the loop stops as soon as a read fails.
    while (file >> padding >> x >> padding >> y >> padding) {
        if (x >= 0 && x < size && y >= 0 && y < size) {
            array[x][y] = '1';  // place the mark
        }
    }
    for (int i = 0; i < size; i++) {
        cout << i << " ";
        for (int j = 0; j < size; j++) {
            cout << array[i][j] << " ";
        }
        cout << endl;
    }
    cout << " ";
    for (int i = 0; i < size; i++) cout << i << " ";
    cout << endl;
    return 0;
}

would produce one random lowercase letter and one random uppercase letter

A second run might give

| gC |

Answer 1

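Your question and your example do not actually express the same problem. Let me assume that your example reflects what you need: you want to split your input into non-overlapping tiles and draw a random sample from each tile.

In that case, you can use tf.nn.max_pool_with_argmax on a surrogate random image of the same size as your input. The argmax of the random image is a random position in each tile, which you then use to index the real input.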

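import tensorflow as tf  # TF 1.x API (tf.random_uniform)

my_input = ... # for example a 1x3x6x1 tensor similar to your example
tile_size = [1, 3, 3, 1]
# surrogate random image; its per-tile argmax is a uniformly random position
r = tf.random_uniform(my_input.shape)
maxr, idxs = tf.nn.max_pool_with_argmax(r, tile_size, tile_size, 'SAME')
# idxs are flattened positions; look them up in the flattened input
rand_samp_per_tile = tf.reshape(tf.gather(tf.reshape(my_input, [-1]), idxs), maxr.shape)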
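
The same trick can be tried without TensorFlow. The following is only a minimal sketch in plain Python, not the TensorFlow code above: the function name random_sample_per_tile and the 3x6 letter grid are assumptions made up for illustration. It draws uniform noise with the same shape as the input, takes the argmax of the noise inside each non-overlapping tile, and reads the input at that position.

```python
import random

def random_sample_per_tile(image, tile_h, tile_w, rng=random):
    """Pick one random pixel from each non-overlapping tile of a 2-D list.

    Mirrors the trick above: draw uniform noise with the same shape as the
    image, find the argmax of the noise inside each tile, and read the image
    at that position.
    """
    height, width = len(image), len(image[0])
    noise = [[rng.random() for _ in range(width)] for _ in range(height)]
    out = []
    for top in range(0, height, tile_h):
        row = []
        for left in range(0, width, tile_w):
            # All (y, x) positions in this tile, clipped at the image border.
            cells = [(y, x)
                     for y in range(top, min(top + tile_h, height))
                     for x in range(left, min(left + tile_w, width))]
            # The argmax of i.i.d. uniform noise is a uniformly random cell.
            y, x = max(cells, key=lambda p: noise[p[0]][p[1]])
            row.append(image[y][x])
        out.append(row)
    return out

# Hypothetical 3x6 input: lowercase letters in the left tile, uppercase in the right.
image = [list("abcABC"), list("defDEF"), list("ghiGHI")]
print(random_sample_per_tile(image, 3, 3))  # one lowercase and one uppercase letter
```

Because the noise values are independent and identically distributed, every cell of a tile is equally likely to hold the maximum, which is why the argmax behaves like a uniform random pick.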

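True stochastic pooling over overlapping sliding windows cannot rely on this trick; you basically need to do the random sampling yourself. This is more demanding, mainly because you have to work out the offsets of the sliding windows from the tensor size, the window size and the padding type.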
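
To give an idea of the bookkeeping involved, here is a rough sketch in plain Python for a single 2-D image. The helper names window_starts and stochastic_pool_2d are hypothetical, and the padding arithmetic assumes the usual TensorFlow 'VALID'/'SAME' convention; treat it as an illustration rather than a drop-in replacement for a TensorFlow op.

```python
import math
import random

def window_starts(size, k, s, padding):
    """Start offset of each sliding window along one axis.

    Assumes the usual TensorFlow convention: 'VALID' keeps only windows that
    fit entirely inside the input; 'SAME' pads so that the output length is
    ceil(size / s), putting the extra padding cell (if any) at the end.
    """
    if padding == "VALID":
        out = (size - k) // s + 1
        pad_before = 0
    elif padding == "SAME":
        out = math.ceil(size / s)
        pad_total = max((out - 1) * s + k - size, 0)
        pad_before = pad_total // 2
    else:
        raise ValueError("padding must be 'VALID' or 'SAME'")
    return [i * s - pad_before for i in range(out)]

def stochastic_pool_2d(image, k, s, padding="VALID", rng=random):
    """Return one uniformly random in-bounds pixel per k x k sliding window."""
    height, width = len(image), len(image[0])
    out = []
    for top in window_starts(height, k, s, padding):
        row = []
        for left in window_starts(width, k, s, padding):
            # Clip the window to the image so padded cells are never sampled.
            y = rng.randrange(max(top, 0), min(top + k, height))
            x = rng.randrange(max(left, 0), min(left + k, width))
            row.append(image[y][x])
        out.append(row)
    return out

# Hypothetical 5x5 image whose pixel values encode their own position.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
print(window_starts(5, 3, 1, "VALID"))  # [0, 1, 2]
print(window_starts(5, 3, 2, "SAME"))   # [-1, 1, 3]
print(stochastic_pool_2d(image, 3, 1, "VALID"))
```

A negative start offset means the window begins in the padded border; the sketch clips such windows to the image instead of sampling padding values, which is one possible design choice among several.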

Answer 2

I created a function styled after tf.nn.max_pool and tf.layers.max_pooling2d. You can find the full version in my GitHub gist.

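I created a constant tensor that holds the corner point of each pooling window, then used tf.random_uniform to create another tensor that holds two random offsets from each window corner (one along the height and one along the width).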

tr = tf.random_uniform((N, ch, cw, 2), 0, k, tf.int32)

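The offsets should be the same across the color channels; for that I used tf.stack and tf.transpose.

Finally I used tf.gather_nd to pick the pixels and form the final tensor.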

# body of the pooling function; assumes numpy as np and tensorflow (1.x) as tf are imported
# inputs is an NHWC tensor, k the window size, s the stride; padding and name are arguments too
N, H, W, C = inputs.shape
CH, CW = (H, W) if padding==PADDING_SAME else (H-k+1, W-k+1)
# corner points: one (n, h, w, c) index per window and channel
c = np.array([ [ [ [ (i, h, w, j) for j in range(C) ] for w in range(0, CW, s) ] for h in range(0, CH, s) ] for i in range(N)])
ch, cw = c.shape[1], c.shape[2]
tc = tf.constant(c, dtype=tf.int32)
# random (y, x) offset from each corner; expanded below with a redundant axis for C
tr = tf.random_uniform((N, ch, cw, 2), 0, k, tf.int32)
# repeat each C times, and make [y, x] into [0, y, x, 0]
tr = tf.transpose(tf.stack([tr for i in range(C)]), [1, 2, 3, 0, 4])
tr = tf.pad(tr, [[0,0], [0,0], [0,0], [0,0], [ 1, 1 ]])
# index of randomly shifted corner point
if padding==PADDING_VALID:
    ix = tc+tr
else:
    # max points: clamp indices that would run past the image border
    m = np.array([ [ [ [ (N-1, H-1, W-1, C-1) for j in range(C) ] for w in range(0, CW, s) ] for h in range(0, CH, s) ] for i in range(N)])
    tm = tf.constant(m, dtype=tf.int32)
    ix = tf.minimum(tc+tr, tm)
return tf.gather_nd(inputs, ix, name=name)
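
For readers who want to see the corner-plus-offset idea in isolation, here is a small sketch in plain Python for one H x W x C image with windows that fit entirely inside the image. The function name random_pool_hwc and the test image are assumptions for illustration only; it is not the gist's code. It draws a single (dy, dx) offset per window and reuses it for all channels.

```python
import random

def random_pool_hwc(image, k, s, rng=random):
    """Corner-plus-offset sampling on one H x W x C image ('VALID' windows).

    A single (dy, dx) offset is drawn per window and reused for every
    channel, so each output pixel is a real pixel of the input rather than
    a mix of channels taken from different positions.
    """
    height, width = len(image), len(image[0])
    out = []
    for top in range(0, height - k + 1, s):        # window corner rows
        row = []
        for left in range(0, width - k + 1, s):    # window corner columns
            dy, dx = rng.randrange(k), rng.randrange(k)
            # Copy the whole channel vector found at corner + offset.
            row.append(list(image[top + dy][left + dx]))
        out.append(row)
    return out

# Hypothetical 4x4 image with 3 channels; each pixel stores (y, x, y + x).
image = [[[y, x, y + x] for x in range(4)] for y in range(4)]
pooled = random_pool_hwc(image, k=2, s=2)
print(pooled)  # 2x2 grid of pixels, each channel triple taken from one position
```

Since every pixel in the test image records its own coordinates, it is easy to check that the three channels of each output pixel come from the same location inside the right window.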

How to implement stochastic pooling in TensorFlow, which takes a random pixel in each sliding window
