Question

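I read this article about using "resize convolutions" rather than the "deconvolution" (i.e. transposed convolution) method to generate images with neural networks. It is clear how this works when the stride size is 1, but how would you implement it when the stride size is > 1?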

Here is how I implemented this in TensorFlow. Note: this is the second "deconvolution" layer in the decoder part of an autoencoder network.

h_d_upsample2 = tf.image.resize_images(images=h_d_conv3,
                                       size=(int(self.c2_size), int(self.c2_size)),
                                       method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
h_d_conv2 = tf.layers.conv2d(inputs=h_d_upsample2,
                             filters=FLAGS.C2,
                             kernel_size=(FLAGS.c2_kernel, FLAGS.c2_kernel),
                             padding='same',
                             activation=tf.nn.relu)

Answer 1

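The reason we use transposed convolution is to increase the resolution of the activation maps. If you replace it with a strided conv2d, you reduce the resolution, when your goal is to increase the output resolution.

That said, you can use strides, but you then need to apply a larger rescaling factor in the resize step to reach the desired resolution.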
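To make the "larger rescaling factor" concrete, here is a minimal sketch of the size arithmetic. It assumes the convolution uses padding='same' (as in the question's code), for which the spatial output size is ceil(input_size / stride). The helper names are hypothetical and only illustrate the bookkeeping.

```python
import math


def same_conv_output_size(input_size: int, stride: int) -> int:
    """Spatial output size of a convolution with padding='same'."""
    return math.ceil(input_size / stride)


def resize_target_for(desired_size: int, stride: int) -> int:
    """Size to resize to so that a 'same' conv with this stride yields desired_size."""
    return desired_size * stride


input_size = 32      # spatial size before the resize step (example value)
desired_size = 64    # spatial size wanted after the convolution (example value)

for stride in (1, 2):
    resized = resize_target_for(desired_size, stride)
    out = same_conv_output_size(resized, stride)
    print(f"stride={stride}: resize {input_size} -> {resized}, conv output {out}")
```

With stride 1 the resize step only has to double the feature map; with stride 2 it has to quadruple it, because the strided convolution then halves it again.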

Answer 2

Resizing images really isn't well suited to intermediate network layers. You could try conv2d_transpose instead.

If the stride size is > 1, how would you implement it?

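# TensorFlow 1.x. A common approach is tf.nn.conv2d_transpose, which works with stride > 1.
# With 'SAME' padding: output height/width = stride * input height/width.
import tensorflow as tf

batch_size = 16  # assumed for illustration; output_shape must include the batch dimension
input_shape = [batch_size, 32, 32, 48]     # NHWC
output_shape = [batch_size, 64, 64, 128]   # NHWC
inputs = tf.placeholder(tf.float32, shape=input_shape)

stride = 2
filter_size_h = filter_size_w = 2
# Filter layout for conv2d_transpose: [height, width, output_channels, input_channels]
shape = [filter_size_h, filter_size_w, output_shape[-1], input_shape[-1]]
w = tf.get_variable(
    name='W',
    shape=shape,
    initializer=tf.contrib.layers.variance_scaling_initializer(),
    trainable=True)

output = tf.nn.conv2d_transpose(
    inputs, w, output_shape=output_shape, strides=[1, stride, stride, 1])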
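The output_shape passed above has to be compatible with the stride. As a rough sketch, and assuming 'SAME' padding, a spatial output size is acceptable when a forward convolution with the same stride would map it back to the input size, that is when ceil(output_size / stride) == input_size. The helper names below are hypothetical, and this is plain arithmetic rather than a TensorFlow call.

```python
import math


def is_consistent(input_size: int, output_size: int, stride: int) -> bool:
    """True if a 'SAME' forward conv with this stride maps output_size back to input_size."""
    return math.ceil(output_size / stride) == input_size


def consistent_output_sizes(input_size: int, stride: int) -> range:
    """All spatial output sizes that satisfy is_consistent for the given input and stride."""
    return range((input_size - 1) * stride + 1, input_size * stride + 1)


# The example above: 32x32 input, stride 2, requested 64x64 output.
print(is_consistent(32, 64, 2))
print(list(consistent_output_sizes(32, 2)))
```

Under this assumption, stride * input_size is always the largest consistent choice, which matches the rule of thumb in the code comment above.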

Can a stride size greater than 1 be achieved with resize convolutions alone?
