Image shape after MaxPooling2D with padding='same': calculating layer-by-layer shapes in a convolutional autoencoder

Time: 2017-09-24 05:53:41

Tags: deep-learning keras-layer keras-2

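Put very simply, my question is why the image size does not stay the same as the input image size after a max-pooling layer when I use padding='same' in my Keras code. I am working through the Keras blog post Building Autoencoders in Keras and building a convolutional autoencoder. The autoencoder code is as follows: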

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_layer = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_layer, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

According to autoencoder.summary(), the output after the first layer, Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer), is 28 X 28 X 16, i.e. the same spatial size as the input image. This is because the padding is 'same'.

In [49]: autoencoder.summary()
(Numbering of layers is given by me and not produced in output)
_________________________________________________________________
  Layer (type)                 Output Shape             Param #   
=================================================================
1.input_1 (InputLayer)         (None, 28, 28, 1)         0         
_________________________________________________________________
2.conv2d_1 (Conv2D)            (None, 28, 28, 16)        160       
_________________________________________________________________
3.max_pooling2d_1 (MaxPooling2 (None, 14, 14, 16)        0         
_________________________________________________________________
4.conv2d_2 (Conv2D)            (None, 14, 14, 8)         1160      
_________________________________________________________________
5.max_pooling2d_2 (MaxPooling2 (None, 7, 7, 8)           0         
_________________________________________________________________
6.conv2d_3 (Conv2D)            (None, 7, 7, 8)           584       
_________________________________________________________________
7.max_pooling2d_3 (MaxPooling2 (None, 4, 4, 8)           0         
_________________________________________________________________
8.conv2d_4 (Conv2D)            (None, 4, 4, 8)           584       
_________________________________________________________________
9.up_sampling2d_1 (UpSampling2 (None, 8, 8, 8)           0         
_________________________________________________________________
10.conv2d_5 (Conv2D)            (None, 8, 8, 8)           584       
_________________________________________________________________
11.up_sampling2d_2 (UpSampling2 (None, 16, 16, 8)         0         
_________________________________________________________________
12.conv2d_6 (Conv2D)            (None, 14, 14, 16)        1168      
_________________________________________________________________
13.up_sampling2d_3 (UpSampling2 (None, 28, 28, 16)        0         
_________________________________________________________________
14.conv2d_7 (Conv2D)            (None, 28, 28, 1)         145       
=================================================================

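The next layer (layer 3) is MaxPooling2D((2, 2), padding='same')(x). summary() shows the output size of this layer as 14 X 14 X 16. But the padding in this layer is also 'same'. So why does the output not remain 28 X 28 X 16, with zeros padded in?

Also, it is not clear how the output shape becomes (14 X 14 X 16) after layer 12, when the input shape from the preceding layer is (16 X 16 X 8).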


1 Answer:

Answer 0 (score: 2)

  

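"The next layer (layer 3) is MaxPooling2D((2, 2), padding='same')(x). summary() shows the output size of this layer as 14 X 14 X 16. But the padding in this layer is also 'same'. So why does the output not remain 28 X 28 X 16, with zeros padded in?"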

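There seems to be a misunderstanding about what padding does. Padding only handles the edge cases (operations next to the image border). But you have a 2x2 max-pooling operation, and in Keras the default stride equals the pool size, so the stride is 2, which halves the image size. To avoid this you need to specify strides=1 explicitly. With padding='same' the output size is ceil(input size / stride) in each spatial dimension, so it matches the input size only when the stride is 1. From the Keras docs: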

  

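pool_size: integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length will be used for both dimensions.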

     

strides: integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size.
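The arithmetic behind this can be checked without building a model. The following is a minimal sketch in plain Python, not Keras code: the helper name pool_output_size is hypothetical, and it assumes the usual TensorFlow/Keras convention that 'same' padding gives ceil(input / stride) and 'valid' padding gives floor((input - pool) / stride) + 1 along each spatial dimension.

```python
import math


def pool_output_size(input_size, pool_size=2, strides=None, padding="valid"):
    """Output size of a pooling layer along one spatial dimension.

    Hypothetical helper for illustration; mirrors the assumed Keras rule
    that strides defaults to pool_size when it is None.
    """
    if strides is None:
        strides = pool_size
    if padding == "same":
        # Padding is added so that every stride position yields an output.
        return math.ceil(input_size / strides)
    # 'valid': only windows that fit entirely inside the input are used.
    return (input_size - pool_size) // strides + 1


print(pool_output_size(28, padding="same"))             # layer 3: 14
print(pool_output_size(14, padding="same"))             # layer 5: 7
print(pool_output_size(7, padding="same"))              # layer 7: 4
print(pool_output_size(7, padding="valid"))             # without padding: 3
print(pool_output_size(28, strides=1, padding="same"))  # stride 1 keeps 28
```

Under these assumptions, padding='same' only matters for the odd-sized input (7 becomes 4 rather than 3); the halving itself comes from the stride of 2.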

Regarding the second question:

  

"Also, it is not clear how the output shape becomes (14 X 14 X 16) after layer 12, when the input shape from the preceding layer is (16 X 16 X 8)."

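Layer 12 does not have padding='same' specified, so it uses the Keras default, padding='valid'. With no padding, a 3x3 kernel at stride 1 fits in only 16 - 3 + 1 = 14 positions along each dimension, so the output is 14 X 14 X 16.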
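To see how layer 12 fits into the rest of the decoder, here is a small sketch in plain Python that traces one spatial dimension through layers 8 to 14 of the model in the question. The helper names conv_output_size and decoder_sizes are hypothetical, and the size rules are the assumed Keras conventions: 'same' gives ceil(input / stride), 'valid' gives floor((input - kernel) / stride) + 1, and UpSampling2D((2, 2)) doubles the size.

```python
import math


def conv_output_size(input_size, kernel_size=3, strides=1, padding="valid"):
    """Output size of a convolution along one spatial dimension (hypothetical helper)."""
    if padding == "same":
        return math.ceil(input_size / strides)
    return (input_size - kernel_size) // strides + 1


def decoder_sizes(encoded_size, layer12_padding):
    """Trace one spatial dimension through layers 8-14 of the question's model."""
    sizes = []
    size = conv_output_size(encoded_size, padding="same")   # layer 8: Conv2D
    sizes.append(size)
    size *= 2                                               # layer 9: UpSampling2D
    sizes.append(size)
    size = conv_output_size(size, padding="same")           # layer 10: Conv2D
    sizes.append(size)
    size *= 2                                               # layer 11: UpSampling2D
    sizes.append(size)
    size = conv_output_size(size, padding=layer12_padding)  # layer 12: Conv2D
    sizes.append(size)
    size *= 2                                               # layer 13: UpSampling2D
    sizes.append(size)
    size = conv_output_size(size, padding="same")           # layer 14: Conv2D
    sizes.append(size)
    return sizes


print(decoder_sizes(4, "valid"))  # [4, 8, 8, 16, 14, 28, 28]
print(decoder_sizes(4, "same"))   # [4, 8, 8, 16, 16, 32, 32]
```

Under these assumptions, the 'valid' padding in layer 12 is what brings the decoder output back to 28 X 28; with padding='same' in that layer the trace ends at 32 X 32, which would no longer match the 28 X 28 input.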