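Put very simply, my question is why the image size does not stay the same as the input image size after a max-pooling layer when I use padding='same' in my Keras code. I am working through the Keras blog post "Building Autoencoders in Keras" and building a convolutional autoencoder. The autoencoder code is as follows: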
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_layer = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_layer, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
According to autoencoder.summary(), the output of the first layer, Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer), is 28 x 28 x 16, i.e. the same spatial size as the input image. This is because the padding is 'same'.
In [49]: autoencoder.summary()
(Numbering of layers is given by me and not produced in output)
_________________________________________________________________
Layer (type)                     Output Shape          Param #
=================================================================
1.input_1 (InputLayer)           (None, 28, 28, 1)     0
_________________________________________________________________
2.conv2d_1 (Conv2D)              (None, 28, 28, 16)    160
_________________________________________________________________
3.max_pooling2d_1 (MaxPooling2   (None, 14, 14, 16)    0
_________________________________________________________________
4.conv2d_2 (Conv2D)              (None, 14, 14, 8)     1160
_________________________________________________________________
5.max_pooling2d_2 (MaxPooling2   (None, 7, 7, 8)       0
_________________________________________________________________
6.conv2d_3 (Conv2D)              (None, 7, 7, 8)       584
_________________________________________________________________
7.max_pooling2d_3 (MaxPooling2   (None, 4, 4, 8)       0
_________________________________________________________________
8.conv2d_4 (Conv2D)              (None, 4, 4, 8)       584
_________________________________________________________________
9.up_sampling2d_1 (UpSampling2   (None, 8, 8, 8)       0
_________________________________________________________________
10.conv2d_5 (Conv2D)             (None, 8, 8, 8)       584
_________________________________________________________________
11.up_sampling2d_2 (UpSampling2  (None, 16, 16, 8)     0
_________________________________________________________________
12.conv2d_6 (Conv2D)             (None, 14, 14, 16)    1168
_________________________________________________________________
13.up_sampling2d_3 (UpSampling2  (None, 28, 28, 16)    0
_________________________________________________________________
14.conv2d_7 (Conv2D)             (None, 28, 28, 1)     145
=================================================================
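The next layer (layer 3) is MaxPooling2D((2, 2), padding='same')(x). summary() shows the output size of this layer as 14 x 14 x 16. But the padding in this layer is also 'same'. So why does the output not stay at 28 x 28 x 16, padded with zeros?
Also, it is not clear how the output shape becomes (14 x 14 x 16) after layer 12, when the input shape coming from the preceding layer is (16 x 16 x 8).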
Answer 0 (score: 2)
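The next layer (layer 3) is MaxPooling2D((2, 2), padding='same')(x). summary() shows the output size of this layer as 14 x 14 x 16. But the padding in this layer is also 'same'. So why does the output not stay at 28 x 28 x 16, padded with zeros?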
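There seems to be a misunderstanding about what padding does. Padding only handles the edge cases, that is, what happens when the window sits next to the image border. You have a 2x2 max-pooling operation, and in Keras the default strides value equals the pool size, so the stride is 2, which halves the image size. You need to set strides=1 explicitly to avoid that; with padding='same' and strides=1 the output keeps the input's spatial size. From the Keras docs: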
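pool_size: integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length will be used for both dimensions.
strides: integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size.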
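The size arithmetic can be checked without running Keras. The following is a minimal sketch, assuming the usual TensorFlow/Keras convention that with padding='same' the output length along an axis is ceil(input / stride). The helper name same_output_size is mine and is not part of Keras.

```python
import math

def same_output_size(n, stride):
    """Output length along one axis for padding='same' (hypothetical helper)."""
    return math.ceil(n / stride)

# MaxPooling2D((2, 2), padding='same'): strides defaults to pool_size, i.e. 2.
sizes = [28]
for _ in range(3):  # the encoder has three pooling layers
    sizes.append(same_output_size(sizes[-1], 2))
print(sizes)  # [28, 14, 7, 4]

# With a stride of 1, 'same' padding preserves the spatial size.
print(same_output_size(28, 1))  # 28
```

The sequence 28, 14, 7, 4 matches the pooling rows of the summary in the question. The second call corresponds to passing strides=(1, 1) to MaxPooling2D.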
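Regarding the second question:
Also, it is not clear how the output shape becomes (14 x 14 x 16) after layer 12, when the input shape coming from the preceding layer is (16 x 16 x 8).
Layer 12 does not specify padding='same', so it falls back to the default, padding='valid'. With no padding, a 3x3 kernel fits only 16 - 3 + 1 = 14 positions along each axis, hence 14 x 14; the 16 in the last dimension is the number of filters.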
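The decoder shapes can be traced the same way. This is a sketch, assuming the standard rule that a convolution without padding produces (n - kernel) // stride + 1 outputs per axis; valid_output_size is a hypothetical helper, not a Keras function.

```python
def valid_output_size(n, kernel, stride=1):
    """Output length along one axis for padding='valid' (hypothetical helper)."""
    return (n - kernel) // stride + 1

size = 4                           # encoded representation is 4 x 4
size *= 2                          # layer 9:  UpSampling2D((2, 2)) -> 8
size *= 2                          # layer 11: UpSampling2D((2, 2)) -> 16
size = valid_output_size(size, 3)  # layer 12: 3x3 Conv2D, no padding -> 14
size *= 2                          # layer 13: UpSampling2D((2, 2)) -> 28
print(size)  # 28
```

This also suggests why that one layer is left unpadded: with padding='same' on every decoder convolution the sizes would go 4, 8, 16, 32, and the output would no longer match the 28 x 28 input.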