How to stack multiple layers of conv2d_transpose() in TensorFlow

Asked: 2016-05-25 18:25:02

Tags: tensorflow deep-learning deconvolution

I am trying to stack 2 layers of tf.nn.conv2d_transpose() to upsample a tensor. It works fine during the forward pass, but fails during backpropagation with: ValueError: Incompatible shapes for broadcasting: (8, 256, 256, 24) and (8, 100, 100, 24)

Basically, I just feed the output of the first conv2d_transpose in as the input of the second:

convt_1 = tf.nn.conv2d_transpose(...)
convt_2 = tf.nn.conv2d_transpose(convt_1, ...)

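With only one conv2d_transpose, everything works. The error appears only when several conv2d_transpose layers are stacked.

I am not sure what the correct way to implement multiple conv2d_transpose layers is. Any advice on how to fix this would be much appreciated.

Here is a small piece of code that reproduces the error: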

import numpy as np
import tensorflow as tf

IMAGE_HEIGHT = 256
IMAGE_WIDTH = 256
CHANNELS = 1

batch_size = 8
num_labels = 2

in_data = tf.placeholder(tf.float32, shape=(batch_size, IMAGE_HEIGHT, IMAGE_WIDTH, CHANNELS))
labels = tf.placeholder(tf.int32, shape=(batch_size, IMAGE_HEIGHT, IMAGE_WIDTH, 1))

# Variables
w0 = tf.Variable(tf.truncated_normal([3, 3, CHANNELS, 32]))
b0 = tf.Variable(tf.zeros([32]))

# Down sample
conv_0 = tf.nn.relu(tf.nn.conv2d(in_data, w0, [1, 2, 2, 1], padding='SAME') + b0)
print("Convolution 0:", conv_0)


# Up sample 1. Upscale to 100 x 100 x 24
wt1 = tf.Variable(tf.truncated_normal([3, 3, 24, 32]))
convt_1 = tf.nn.sigmoid(
          tf.nn.conv2d_transpose(conv_0, 
                                 filter=wt1, 
                                 output_shape=[batch_size, 100, 100, 24], 
                                 strides=[1, 1, 1, 1]))
print("Deconvolution 1:", convt_1)


# Up sample 2. Upscale to 256 x 256 x 2
wt2 = tf.Variable(tf.truncated_normal([3, 3, 2, 24]))
convt_2 = tf.nn.sigmoid(
          tf.nn.conv2d_transpose(convt_1, 
                                 filter=wt2, 
                                 output_shape=[batch_size, IMAGE_HEIGHT, IMAGE_WIDTH, 2], 
                                 strides=[1, 1, 1, 1]))
print("Deconvolution 2:", convt_2)

# Loss computation
logits = tf.reshape(convt_2, [-1, num_labels])
reshaped_labels = tf.reshape(labels, [-1])
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, reshaped_labels)
loss = tf.reduce_mean(cross_entropy)

optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

1 Answer:

Answer 0 (score: 6)

I think you need to change the 'strides' parameter in conv2d_transpose. conv2d_transpose is like conv2d, but with the input and output reversed.

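For conv2d, the stride and the input shape determine the output shape. For conv2d_transpose, the stride and the output shape determine the input shape. Right now your stride is [1 1 1 1], which means the output and input of conv2d_transpose are about the same size (ignoring boundary effects).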
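The rule above can be checked with plain arithmetic before building the graph. The following is a minimal sketch, not TensorFlow's own code: the helper names conv_output_size and transpose_shape_ok are hypothetical, and it assumes the documented forward conv2d size rules (SAME: ceil(in / stride); VALID: ceil((in - kernel + 1) / stride)) together with the default padding='SAME' of conv2d_transpose.

```python
import math


def conv_output_size(input_size, kernel_size, stride, padding="SAME"):
    """Spatial output size of a forward conv2d along one dimension."""
    if padding == "SAME":
        return math.ceil(input_size / stride)
    if padding == "VALID":
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def transpose_shape_ok(input_size, output_size, kernel_size, stride, padding="SAME"):
    """Hypothetical check: conv2d_transpose reverses conv2d, so the requested
    output size, pushed through a forward conv2d, must give back the input size."""
    return conv_output_size(output_size, kernel_size, stride, padding) == input_size


# Shapes from the question (3x3 kernels, SAME padding).
conv_0_size = conv_output_size(256, 3, 2)          # stride-2 downsample of 256
print(conv_0_size)
print(transpose_shape_ok(conv_0_size, 100, 3, 1))  # first layer as written
print(transpose_shape_ok(100, 256, 3, 1))          # second layer as written
print(transpose_shape_ok(conv_0_size, 256, 3, 2))  # stride 2 back up to 256
```

Under these assumptions both layers in the question fail the check, while a single stride-2 layer from the 128 x 128 output of conv_0 back to 256 x 256 passes.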

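For an input with H = W = 100 and stride = [1 2 2 1], the output of conv2d_transpose should be 200 (the reverse of conv2d), provided padding is set to SAME. In short, the input, output and stride need to be compatible.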
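To make "compatible" concrete, here is a small sketch that lists which output sizes fit a given input size and stride under SAME padding. The function name same_transpose_output_range is hypothetical. It only assumes the forward SAME rule ceil(out / stride) == in, which rearranges to (in - 1) * stride + 1 <= out <= in * stride.

```python
def same_transpose_output_range(input_size, stride):
    """Smallest and largest output sizes (inclusive) that satisfy
    ceil(output_size / stride) == input_size, i.e. the SAME-padding rule."""
    smallest = (input_size - 1) * stride + 1
    largest = input_size * stride
    return smallest, largest


# The example from the answer: input 100, stride 2.
print(same_transpose_output_range(100, 2))
# The 128 x 128 output of conv_0 in the question, upsampled with stride 2.
print(same_transpose_output_range(128, 2))
# Stride 1 leaves no room: the output must equal the input.
print(same_transpose_output_range(100, 1))
```

With stride 1 the only compatible output size equals the input size, which is why asking for 100 -> 256 with strides=[1, 1, 1, 1] cannot work.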