I am learning about convolutional autoencoders and am using Keras to build an image denoiser. The following code builds the model:
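from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, UpSampling2D
denoiser = Sequential()
denoiser.add(Conv2D(32, (3,3), input_shape=(28,28,1), padding='same'))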
denoiser.add(Activation('relu'))
denoiser.add(MaxPooling2D(pool_size=(2,2)))
denoiser.add(Conv2D(16, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(MaxPooling2D(pool_size=(2,2)))
denoiser.add(Conv2D(8, (3,3), padding='same'))
denoiser.add(Activation('relu'))
################## HEY WHAT NO MAXPOOLING?
denoiser.add(Conv2D(8, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(UpSampling2D((2,2)))
denoiser.add(Conv2D(16, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(UpSampling2D((2,2)))
denoiser.add(Conv2D(1, (3,3), padding='same'))
denoiser.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
denoiser.summary()
It gives the following summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_155 (Conv2D) (None, 28, 28, 32) 320
_________________________________________________________________
activation_162 (Activation) (None, 28, 28, 32) 0
_________________________________________________________________
max_pooling2d_99 (MaxPooling (None, 14, 14, 32) 0
_________________________________________________________________
conv2d_156 (Conv2D) (None, 14, 14, 16) 4624
_________________________________________________________________
activation_163 (Activation) (None, 14, 14, 16) 0
_________________________________________________________________
max_pooling2d_100 (MaxPoolin (None, 7, 7, 16) 0
_________________________________________________________________
conv2d_157 (Conv2D) (None, 7, 7, 8) 1160
_________________________________________________________________
activation_164 (Activation) (None, 7, 7, 8) 0
_________________________________________________________________
conv2d_158 (Conv2D) (None, 7, 7, 8) 584
_________________________________________________________________
activation_165 (Activation) (None, 7, 7, 8) 0
_________________________________________________________________
up_sampling2d_25 (UpSampling (None, 14, 14, 8) 0
_________________________________________________________________
conv2d_159 (Conv2D) (None, 14, 14, 16) 1168
_________________________________________________________________
activation_166 (Activation) (None, 14, 14, 16) 0
_________________________________________________________________
up_sampling2d_26 (UpSampling (None, 28, 28, 16) 0
_________________________________________________________________
conv2d_160 (Conv2D) (None, 28, 28, 1) 145
=================================================================
Total params: 8,001
Trainable params: 8,001
Non-trainable params: 0
_________________________________________________________________
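I am not sure how the output sizes of the MaxPooling2D, Conv2D, and UpSampling2D layers are calculated. I have read the Keras documentation but am still confused. Many parameters affect the output shape, such as the stride or padding of a Conv2D layer, and I do not know exactly how they affect it.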
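I do not understand why there is no MaxPooling2D layer before the commented line. Editing the code to include a convmodel3.add(MaxPooling2D(pool_size=(2,2))) layer above the comment changes the final output shape to (None, 12, 12, 1).
Editing the code to include a convmodel3.add(MaxPooling2D(pool_size=(2,2))) layer before the comment, followed by convmodel3.add(UpSampling2D((2,2))), changes the final output to (None, 24, 24, 1). Shouldn't it be (None, 28, 28, 1)?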
Code and summary:
convmodel3 = Sequential()
convmodel3.add(Conv2D(32, (3,3), input_shape=(28,28,1), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2)))
convmodel3.add(Conv2D(16, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2)))
convmodel3.add(Conv2D(8, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2))) # ADDED MAXPOOL
################## HEY WHAT NO MAXPOOLING?
convmodel3.add(UpSampling2D((2,2))) # ADDED UPSAMPLING
convmodel3.add(Conv2D(16, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(UpSampling2D((2,2)))
convmodel3.add(Conv2D(32, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(UpSampling2D((2,2)))
convmodel3.add(Conv2D(1, (3,3), padding='same'))
convmodel3.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
convmodel3.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_247 (Conv2D) (None, 28, 28, 32) 320
_________________________________________________________________
activation_238 (Activation) (None, 28, 28, 32) 0
_________________________________________________________________
max_pooling2d_141 (MaxPoolin (None, 14, 14, 32) 0
_________________________________________________________________
conv2d_248 (Conv2D) (None, 14, 14, 16) 4624
_________________________________________________________________
activation_239 (Activation) (None, 14, 14, 16) 0
_________________________________________________________________
max_pooling2d_142 (MaxPoolin (None, 7, 7, 16) 0
_________________________________________________________________
conv2d_249 (Conv2D) (None, 7, 7, 8) 1160
_________________________________________________________________
activation_240 (Activation) (None, 7, 7, 8) 0
_________________________________________________________________
max_pooling2d_143 (MaxPoolin (None, 3, 3, 8) 0
_________________________________________________________________
up_sampling2d_60 (UpSampling (None, 6, 6, 8) 0
_________________________________________________________________
conv2d_250 (Conv2D) (None, 6, 6, 16) 1168
_________________________________________________________________
activation_241 (Activation) (None, 6, 6, 16) 0
_________________________________________________________________
up_sampling2d_61 (UpSampling (None, 12, 12, 16) 0
_________________________________________________________________
conv2d_251 (Conv2D) (None, 12, 12, 32) 4640
_________________________________________________________________
activation_242 (Activation) (None, 12, 12, 32) 0
_________________________________________________________________
up_sampling2d_62 (UpSampling (None, 24, 24, 32) 0
_________________________________________________________________
conv2d_252 (Conv2D) (None, 24, 24, 1) 289
=================================================================
Total params: 12,201
Trainable params: 12,201
Non-trainable params: 0
_________________________________________________________________
What is the significance of None in the output shape?
Also, editing the Conv2D layers to not include padding raises an error:
ValueError: Negative dimension size caused by subtracting 3 from 2 for 'conv2d_240/convolution' (op: 'Conv2D') with input shapes: [?,2,2,16], [3,3,16,32].
Why?
Answer 0 (score: 1)
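For convolutional (here 2D) layers, the important things to consider are the volume of the image (width x height x depth) and the four parameters you give the layer. Those parameters are the number of filters K, the filter size F, the stride S, and the amount of zero padding P.
The formula for the output shape is
W_out = (W_in - F + 2P) / S + 1
H_out = (H_in - F + 2P) / S + 1
D_out = K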
This is taken from the thread "what is the effect of tf.nn.conv2d() on an input tensor shape?", where more information about zero padding and related details can be found.
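To make the convolution size rule concrete, here is a minimal sketch in plain Python. The helper name conv_output_size is hypothetical (it is not a Keras function), and it assumes the two Keras padding modes behave as follows: 'same' pads so that the output is ceil(n / stride), and 'valid' applies no padding at all.

```python
import math

def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Spatial output size of a Conv2D layer along one axis.

    Hypothetical helper for illustration only.
    Assumes 'same' gives ceil(n / stride) and 'valid' uses no padding (P = 0).
    """
    if padding == "same":
        return math.ceil(n / stride)
    if padding == "valid":
        return (n - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

print(conv_output_size(28, 3, padding="same"))   # 28: the size is preserved
print(conv_output_size(28, 3, padding="valid"))  # 26: one pixel is lost on each side
print(conv_output_size(2, 3, padding="valid"))   # 0: a 3x3 kernel does not fit in a 2x2 input
```

The last call is the situation the ValueError in the question describes: without padding, a 3x3 kernel is applied to a 2x2 feature map, so there is no valid output position.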
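For max pooling and upsampling, the output size is affected only by the pool size and the stride. In your example the pool size is (2,2) and no stride is defined, so the stride defaults to the pool size (see https://keras.io/layers/pooling/). Upsampling works the same way. Max pooling takes a pool of 2x2 pixels, finds their maximum, and puts it into a single pixel, so 2x2 pixels are converted into 1x1 pixel, which encodes them. Upsampling is the reverse: instead of taking the maximum of the pixel values, it repeats each value across the 2x2 pool.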
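As a rough illustration of what the two layers do to the values, here is a small pure-Python sketch on a single 4x4 channel. The function names max_pool_2x2 and upsample_2x2 are made up for this example; they only mimic the default behaviour of MaxPooling2D (pool size 2, stride 2, no padding) and UpSampling2D (nearest-neighbour repetition).

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a 2D list; leftover rows/columns are dropped."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            max(
                image[2 * r][2 * c], image[2 * r][2 * c + 1],
                image[2 * r + 1][2 * c], image[2 * r + 1][2 * c + 1],
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def upsample_2x2(image):
    """Nearest-neighbour upsampling: repeat every value twice along both axes."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool_2x2(image)
print(pooled)                 # [[6, 8], [14, 16]]
print(upsample_2x2(pooled))   # each pooled value repeated over a 2x2 block
```

Upsampling restores the size but not the original values, which is why the convolution layers in the decoder are needed to reconstruct the detail.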
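The reason there is no max pooling layer at that point, and the reason the image dimensions get messed up when you add one, is the image size at that stage. Looking at the network, the image dimensions are already [7,7,8]. With a pool size of (2,2) and a stride of 2, pooling reduces the resolution to [3,3,8], because 7 cannot be halved exactly and the leftover row and column are dropped. After the upsampling layers the size goes 3 -> 6 -> 12 -> 24, so you have lost 4 pixels in every row and every column.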
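The size arithmetic for both models can be checked with a short sketch. The helpers pool, up, and trace are hypothetical names for this example; the Conv2D layers are left out because with padding='same' and the default stride of 1 they do not change the width or height.

```python
def pool(n, size=2):
    # MaxPooling2D with default settings: floor division, the remainder is dropped
    return n // size

def up(n, size=2):
    # UpSampling2D: every row and column is repeated `size` times
    return n * size

def trace(n, ops):
    """Return the list of spatial sizes after each operation in `ops`."""
    sizes = [n]
    for op in ops:
        n = op(n)
        sizes.append(n)
    return sizes

print(trace(28, [pool, pool, up, up]))            # [28, 14, 7, 14, 28]
print(trace(28, [pool, pool, pool, up, up, up]))  # [28, 14, 7, 3, 6, 12, 24]
```

The first trace is the original denoiser, which returns to 28. The second is convmodel3: the single pixel dropped when 7 is pooled down to 3 corresponds to 4 pixels at the original resolution, which is exactly the gap between 24 and 28.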
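As for the meaning of None (I am not 100% sure about this, so please correct me if I am wrong): the network expects a batch of images at the convolutional layers, not a single image. The expected dimensions are generally
[Number of images, Height, Width, Depth]
So the first element is None because the number of images per batch is not fixed when the model is built. The network accepts any batch size, and Keras displays that unspecified dimension as None.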