Upsampling with a 3D transposed convolution layer

Time: 2019-01-10 07:14:42

Tags: tensorflow transpose convolution deconvolution

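Suppose the previous layer gives me a 5-D tensor x of shape [2, 2, 7, 7, 64], where batch = 2, depth = 2, height = 7, width = 7, and in_channels = 64.

I want to upsample it to a tensor of shape [2, 4, 14, 14, 32]. The next steps might be to take it to shapes such as [2, 8, 28, 28, 16], [2, 16, 112, 112, 1], and so on.

I am new to TensorFlow, and I know that transposed convolution is implemented differently in Caffe and TensorFlow. That is, in Caffe you can define the size of the output by changing the kernel stride. In TensorFlow, however, it is more complicated.

So how can I do this with tf.layers.conv3d_transpose or tf.nn.conv3d_transpose?

Can anyone help me? Thanks!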

1 Answer:

Answer 0: (score: 0)

You can use either tf.layers.conv3d_transpose or tf.nn.conv3d_transpose for upsampling.

Let's take the input tensor to be:

import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.layers)
input_layer = tf.placeholder(tf.float32, (2, 2, 7, 7, 64))  # batch, depth, height, width, in_channels

With tf.nn.conv3d_transpose, we have to take care of creating the variables (weights and bias) ourselves:

def conv3d_transpose(name, l_input, w, b, output_shape, stride=1):
    transp_conv = tf.nn.conv3d_transpose(l_input, w, output_shape, strides=[1, stride, stride, stride, 1], padding='SAME')
    return tf.nn.bias_add(transp_conv, b, name=name)

# Create variables for the operation
with tf.device('/cpu:0'):  
    # weights will have the shape [depth, height, width, output_channels, in_channels]
    weights = tf.get_variable(name='w_transp_conv', shape=[3, 3, 3, 32, 64])
    bias = tf.get_variable(name='b_transp_conv', shape=[32]) 

t_conv_layer = conv3d_transpose('t_conv_layer', input_layer, weights, bias,
                                output_shape=[2, 4, 14, 14, 32], stride=2)
print(t_conv_layer) 
# Tensor("t_conv_layer:0", shape=(2, 4, 14, 14, 32), dtype=float32)
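
The bookkeeping in the snippet above (filter layout, bias size, strides list, output_shape) can be sketched in plain Python, without TensorFlow, to plan the next call. The helper name `transpose_conv3d_args` is hypothetical, and the `output_shape` it returns assumes padding='SAME' with an output that is exactly `stride` times the input along depth, height and width.

```python
def transpose_conv3d_args(input_shape, out_channels, kernel=3, stride=2):
    """Hypothetical helper: shapes to pass to tf.nn.conv3d_transpose.

    Assumes NDHWC layout and padding='SAME', where each spatial
    dimension of the output is taken to be input * stride.
    """
    batch, depth, height, width, in_channels = input_shape
    return {
        # Filter layout: [depth, height, width, output_channels, in_channels]
        "filter_shape": [kernel, kernel, kernel, out_channels, in_channels],
        "bias_shape": [out_channels],
        "output_shape": [batch, depth * stride, height * stride,
                         width * stride, out_channels],
        "strides": [1, stride, stride, stride, 1],
    }


args = transpose_conv3d_args([2, 2, 7, 7, 64], out_channels=32)
print(args["filter_shape"])
print(args["output_shape"])
```

Calling it again with the resulting shape and `out_channels=16` gives the arguments for a second layer in the same way.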

With tf.layers.conv3d_transpose (which takes care of both the weights and the bias), we use the same input tensor, input_layer:

t_conv_layer2 = tf.layers.conv3d_transpose(input_layer, filters=32, kernel_size=[3, 3, 3],
                                           strides=(2, 2, 2), padding='SAME', name='t_conv_layer2')
print(t_conv_layer2)  
# Tensor("t_conv_layer2/Reshape_1:0", shape=(2, 4, 14, 14, 32), dtype=float32) 

To obtain the other upsampled tensors, you can repeat this process, changing the strides as needed.

Example with tf.layers.conv3d_transpose:

t_conv_layer3 = tf.layers.conv3d_transpose(t_conv_layer2, filters=16, kernel_size=[3, 3, 3],
                                           strides=(2, 2, 2), padding='SAME', name='t_conv_layer3')
t_conv_layer4 = tf.layers.conv3d_transpose(t_conv_layer3, filters=8, kernel_size=[3, 3, 3],
                                           strides=(2, 2, 2), padding='SAME', name='t_conv_layer4')
t_conv_layer5 = tf.layers.conv3d_transpose(t_conv_layer4, filters=1, kernel_size=[3, 3, 3], 
                                           strides=(1, 2, 2), padding='SAME', name='t_conv_layer5')
print(t_conv_layer5)
# Tensor("t_conv_layer5/Reshape_1:0", shape=(2, 16, 112, 112, 1), dtype=float32)
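
To see how the chain above reaches (2, 16, 112, 112, 1), the shapes can be traced by hand. The following is a minimal sketch in plain Python, assuming that with padding='SAME' and an inferred output shape each spatial dimension is multiplied by its stride; `upsampled_shape` is a hypothetical helper, not a TensorFlow function.

```python
def upsampled_shape(shape, filters, strides):
    """Shape after a 3D transposed convolution with padding='SAME'.

    Assumption: each spatial dimension is multiplied by its stride
    (NDHWC layout); the channel count becomes `filters`.
    """
    batch, depth, height, width, _ = shape
    sd, sh, sw = strides
    return (batch, depth * sd, height * sh, width * sw, filters)


# (filters, strides) for t_conv_layer2 .. t_conv_layer5 above
layers = [(32, (2, 2, 2)), (16, (2, 2, 2)), (8, (2, 2, 2)), (1, (1, 2, 2))]

shape = (2, 2, 7, 7, 64)
trace = [shape]
for filters, strides in layers:
    shape = upsampled_shape(shape, filters, strides)
    trace.append(shape)

for s in trace:
    print(s)
```

The trace makes the choice of strides=(1, 2, 2) in the last layer visible: depth has already reached 16 after t_conv_layer4, so only height and width are doubled again.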

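Note: since tf.nn.conv3d_transpose is actually the gradient of tf.nn.conv3d, you can run the forward operation, tf.nn.conv3d, on a tensor of the intended output shape to make sure that output_shape is correct: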

def print_expected(weights, shape, stride=1):
    output = tf.constant(0.1, shape=shape)
    expected_layer = tf.nn.conv3d(output, weights, strides=[1, stride, stride, stride, 1], padding='SAME')
    print("Expected shape of input layer when considering the output shape ({} and stride {}): {}".format(shape, stride, expected_layer.get_shape()))

So, to produce a transposed convolution of shape [2, 4, 14, 14, 32], we can check, for example, strides 1 and 2:

print_expected(weights, shape=[2, 4, 14, 14, 32], stride=1) 
print_expected(weights, shape=[2, 4, 14, 14, 32], stride=2)

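This prints the following, confirming that the second option (stride 2) is the right choice for producing a tensor with the desired shape: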

Expected shape of input layer when considering the output shape ([2, 4, 14, 14, 32] and stride 1): (2, 4, 14, 14, 64)
Expected shape of input layer when considering the output shape ([2, 4, 14, 14, 32] and stride 2): (2, 2, 7, 7, 64)
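
The same check can be done arithmetically. The sketch below assumes the documented rule that a 'SAME'-padded forward convolution yields ceil(size / stride) along each spatial dimension; the function names are hypothetical. It also lists every output size that maps back to a given input size, which suggests why tf.nn.conv3d_transpose asks for output_shape explicitly: more than one size can be consistent with the same input.

```python
import math


def forward_same_size(size, stride):
    """Spatial size produced by a 'SAME'-padded convolution: ceil(size / stride)."""
    return math.ceil(size / stride)


def valid_output_sizes(input_size, stride):
    """All sizes n whose forward 'SAME' convolution gives input_size."""
    # ceil(n / stride) == input_size
    #   <=>  (input_size - 1) * stride < n <= input_size * stride
    low = (input_size - 1) * stride + 1
    high = input_size * stride
    return [n for n in range(low, high + 1)
            if forward_same_size(n, stride) == input_size]


print(forward_same_size(14, 2))   # height/width of the target shape, stride 2
print(valid_output_sizes(7, 2))   # candidate heights for an input height of 7
```

With stride 1 the only consistent size is the input size itself, which matches the first line of the printed output above.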