Fully convolutional autoencoder

Time: 2017-05-31 09:01:42

Tags: deep-learning convolution autoencoder deconvolution

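I am implementing a convolutional autoencoder, and I am having a hard time finding the correct shapes for the convolution_transpose layers (in the decoder). So far, my encoder looks like this: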

    ('convolution', num_outputs=256, kernel_size=48, stride=2, padding="SAME")
    ('convolution', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution', num_outputs=256, kernel_size=32, stride=1, padding="SAME" )

Now, in the decoder, I am trying to undo it, using:

    ('convolution_transpose', num_outputs=256, kernel_size=32, stride=2, padding="SAME")
    ('convolution_transpose', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution_transpose', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution_transpose', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution_transpose', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution_transpose', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution_transpose', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution_transpose', num_outputs=256, kernel_size=7, stride=1, padding="SAME" )
    ('convolution_transpose', num_outputs=256, kernel_size=48, stride=2, padding="SAME" )
    ('convolution_transpose', num_outputs=1, kernel_size=48, stride=2, padding="SAME" )

I cannot reproduce the size of the input.

Input Size:  (10, 161, 1800, 1)
Output Size: (10, 3600, 1024, 1)

Any ideas on what the correct settings for the decoder layers should be?

1 answer:

Answer 0 (score: 1)

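I'm not sure which platform you are using or what you are trying to accomplish, but your input size should be divisible by the strides of the convolution layers; otherwise the input will be padded (or cropped). Beyond that, the following works in TensorFlow: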

    # `inputs` replaces `in`, which is a reserved keyword in Python
    tf.layers.conv2d(inputs, 256, 3, 2, 'SAME', activation=tf.nn.relu)
    tf.layers.conv2d_transpose(inputs, 256, 3, 2, 'SAME', activation=tf.nn.relu)

where 256 is the number of filters (output feature maps), 3 is the kernel size (3x3), and 2 is the stride.
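
To make the divisibility point concrete, here is a rough sketch of the shape arithmetic in plain Python. It assumes the usual TensorFlow "SAME" rules: a convolution produces ceil(size / stride) outputs and a transposed convolution produces size * stride outputs, regardless of kernel size. The helper names (`conv_same`, `conv_transpose_same`, `trace`) are made up for this illustration, and the stride lists are read off the layer lists posted in the question; the sketch does not try to reproduce the exact output size reported there.

```python
def conv_same(size, stride):
    """Output length of a convolution with padding="SAME": ceil(size / stride)."""
    return -(-size // stride)  # integer ceiling division


def conv_transpose_same(size, stride):
    """Output length of a transposed convolution with padding="SAME"."""
    return size * stride


def trace(size, strides, op):
    """Apply `op` once per layer stride and return the final length."""
    for stride in strides:
        size = op(size, stride)
    return size


# Strides taken from the question's layer lists (assumed to apply to both axes).
encoder_strides = [2] + [1] * 8            # one stride-2 layer, then eight stride-1 layers
decoder_strides = [2] + [1] * 7 + [2, 2]   # three stride-2 layers in total
mirrored_strides = list(reversed(encoder_strides))  # decoder that mirrors the encoder

for name, size in (("height", 161), ("width", 1800)):
    code = trace(size, encoder_strides, conv_same)
    posted = trace(code, decoder_strides, conv_transpose_same)
    mirrored = trace(code, mirrored_strides, conv_transpose_same)
    print(name, size, "-> code", code, "-> posted decoder", posted,
          "| mirrored decoder", mirrored)
```

Under these assumptions the encoder halves each spatial axis once, while the posted decoder doubles it three times, so the decoder cannot land back on the input size. A decoder with a single stride-2 layer brings the width back to 1800, but the height comes out as 162 rather than 161 because 161 is odd; padding or cropping the input to an even height (or cropping the output by one row) would remove that mismatch.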