Question

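I am trying to reduce the resolution of my images to speed up training, so I applied tf.nn.max_pool to the original image. I expected the result to be a smaller, blurrier version of the image, but that is not what I got.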

My original image has shape [320,240,3], and it looks like this:

After max pooling with ksize=[1,2,2,1] and strides=[1,2,2,1], it becomes:

which is produced by the following code:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

# `img` is a numpy.array with shape [320, 240, 3].
# tf.nn.max_pool only accepts a tensor of shape
# [batch_size, height, width, channel], so I need to reshape
# the image to add a dummy batch dimension.

img_tensor = tf.placeholder(tf.float32, shape=[1, 320, 240, 3])
pooled = tf.nn.max_pool(img_tensor, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
pooled_img = pooled.eval(feed_dict={img_tensor: img.reshape([1, 320, 240, 3])})
plt.imshow(np.squeeze(pooled_img, axis=0))

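The pooled image has the expected shape [160,120,3]. It is only the distortion that really confuses me. It should not show this "repeated shift" behavior, because the pooling windows do not overlap, so no pixel is used in more than one window.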

Thanks a lot in advance.

Answer 1

I think the problem is in how your image is reshaped. The image actually has shape [240,320,3] (height first, then width).
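To illustrate why a wrong reshape distorts the picture instead of raising an error, here is a minimal NumPy sketch. The small 4x6 array is a made-up stand-in for the image, not the asker's data. reshape keeps the flat pixel order and only changes where each row is cut, so rows of the wrong length wrap around and every row starts at a shifted position.

```python
import numpy as np

# Hypothetical stand-in for an image: 4 rows (height) x 6 columns (width).
# Every row holds the same ramp 0..5, so each column has a constant value.
height, width = 4, 6
img = np.tile(np.arange(width), (height, 1))

# Wrong: claim the image is 6 rows x 4 columns. reshape does not move any
# data; it only cuts the same flat buffer into rows of length 4, so each
# new row begins partway through an original row.
wrong = img.reshape(width, height)

# Actually swapping the axes requires a transpose, which reorders pixels.
transposed = img.T

print(img)
print(wrong)
```

In the printed `wrong` array the ramp restarts at a different column in each row, which is the same kind of "repeated shift" pattern seen in the pooled image.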

So try using [1,240,320,3] instead of [1,320,240,3], both for the placeholder shape and for the reshape. It should work.
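One way to avoid restating the height and width by hand is to add the batch axis with np.expand_dims, so the axis order of the loaded array is never changed. The sketch below assumes the loaded array really has shape [240,320,3], as suggested above, and uses random data in place of the real image. `max_pool_2x2` is a hypothetical NumPy helper standing in for tf.nn.max_pool so that the example runs without a TensorFlow session; it is not part of any library.

```python
import numpy as np

def max_pool_2x2(batch):
    """Hypothetical NumPy stand-in for a 2x2, stride-2, VALID max pool (NHWC)."""
    n, h, w, c = batch.shape
    # Drop a trailing odd row/column, as VALID padding would.
    batch = batch[:, : h // 2 * 2, : w // 2 * 2, :]
    # Split height and width into (blocks, 2) and take the max in each block.
    blocks = batch.reshape(n, h // 2, 2, w // 2, 2, c)
    return blocks.max(axis=(2, 4))

# Assumed stand-in for the loaded image: 240 rows x 320 columns x 3 channels.
img = np.random.rand(240, 320, 3).astype(np.float32)

# Add the batch axis without typing the height and width by hand.
batch = np.expand_dims(img, axis=0)      # shape (1, 240, 320, 3)
pooled = max_pool_2x2(batch)             # shape (1, 120, 160, 3)
pooled_img = np.squeeze(pooled, axis=0)  # shape (120, 160, 3)
print(batch.shape, pooled_img.shape)
```

With the code from the question, the equivalent change would be to declare the placeholder with shape [1,240,320,3] and feed `np.expand_dims(img, axis=0)`; the pooled image would then have shape [120,160,3].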

Tensorflow
