Question

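I'm trying to run this convolutional autoencoder example with my own data, so I changed its InputLayer to match my images. However, there is a dimension problem at the output layer. I'm sure the problem is with UpSampling, but I'm not sure why it happens. Here is the code: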

N, H, W = X_train.shape
input_img = Input(shape=(H,W,1))  # adapt this if using `channels_first` image data format

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (4, 4, 8) i.e. 128-dimensional

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

autoencoder.summary()

Then, when I run the following, this error is thrown:

i+=1
autoencoder.fit(x_train, x_train,
            epochs=50,
            batch_size=128,
            shuffle=True,
            validation_data=(x_test, x_test),
            callbacks= [TensorBoard(log_dir='/tmp/autoencoder/{}'.format(i))])

ValueError: Error when checking target: expected conv2d_23 to have shape (148, 84, 1) but got array with shape (150, 81, 1)

I went back to the tutorial code and looked at its model summary, which shows the following:

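I'm sure there is a problem when the decoder reconstructs the output, but I'm not sure why, or why it works for the tutorial's 28x28 images but not for mine, which are 150x81.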

I guess I could solve this by changing the dimensions of my images, but I'd like to understand what is happening and how I can avoid it.

Answer 1

You can use ZeroPadding2D to pad the input image to 32x32, and then use Cropping2D to crop the decoded image back to 28x28.

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, ZeroPadding2D, Cropping2D
from keras.models import Model


input_img = Input(shape=(28,28,1))  # adapt this if using `channels_first` image data format
input_img_padding = ZeroPadding2D((2,2))(input_img)  #zero padding image to shape 32X32
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img_padding)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (4, 4, 8) i.e. 128-dimensional

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
decoded_cropping = Cropping2D((2,2))(decoded)

autoencoder = Model(input_img, decoded_cropping) #cropping image from 32X32 to 28X28
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

autoencoder.summary()
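The same pad-then-crop idea can be applied to sizes other than 28x28, such as the 150x81 images from the question. The sketch below is only an illustration: `pad_to_multiple` is a hypothetical helper (not part of Keras) that computes how many rows or columns of zeros are needed to round a dimension up to a multiple of 8, since the model halves each dimension three times. The resulting pairs could then be passed as `((top, bottom), (left, right))` to both ZeroPadding2D and Cropping2D.

```python
def pad_to_multiple(size, multiple=8):
    """Hypothetical helper: (before, after) padding that rounds `size` up to a multiple."""
    total = (-size) % multiple      # number of extra rows/columns required
    before = total // 2             # split as evenly as possible
    return before, total - before


# Padding for the 150x81 images in the question.
pad_h = pad_to_multiple(150)   # padding for the height (150 becomes 152)
pad_w = pad_to_multiple(81)    # padding for the width (81 becomes 88)
print(pad_h, pad_w)

# Assumed usage in the model above:
#   ZeroPadding2D((pad_h, pad_w))(input_img)
#   Cropping2D((pad_h, pad_w))(decoded)
```

For a 28x28 input the helper returns (2, 2) for both dimensions, which matches the `ZeroPadding2D((2,2))` and `Cropping2D((2,2))` calls used above.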

Answer 2

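The Conv2D(16, ...) layer in the decoder (the one just before the last UpSampling2D) does not use any padding, so it shrinks each spatial dimension by 2. You can add padding by changing that layer to the following: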

x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)

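The output dims will then match the input dims whenever the input height and width are multiples of 8 (the model pools by 2 three times). For other sizes, such as 150x81, padding and cropping as in Answer 1 is still needed.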
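To see where the mismatched shape comes from, the size of one spatial dimension can be traced through the question's model with plain arithmetic. This is a rough sketch, not Keras code: `autoencoder_output_size` is a hypothetical function, and it assumes that 'same' convolutions keep the size, that MaxPooling2D with padding='same' rounds up when halving, and that an unpadded 3x3 convolution removes 2 pixels.

```python
import math


def autoencoder_output_size(n, conv16_padding="valid"):
    """Trace one spatial dimension through the encoder and decoder (hypothetical helper)."""
    # Encoder: three MaxPooling2D((2, 2), padding='same') layers, each rounding up.
    for _ in range(3):
        n = math.ceil(n / 2)
    # Decoder: two UpSampling2D((2, 2)) layers around 'same' convolutions.
    n *= 2
    n *= 2
    # The Conv2D(16, (3, 3)) layer: without padding it removes 2 pixels.
    if conv16_padding == "valid":
        n -= 2
    # Final UpSampling2D((2, 2)); the last 'same' convolution keeps the size.
    return n * 2


for size in (28, 150, 81):
    print(size,
          autoencoder_output_size(size, "valid"),
          autoencoder_output_size(size, "same"))
```

Under these assumptions, 150 and 81 come out as 148 and 84 with the unpadded layer, which is the shape reported in the error message, and as 152 and 88 with padding='same'. The tutorial's 28 comes back as 28 only because the unpadded layer happens to cancel the rounding up from 7 to 4 in the last pooling step.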

Keras shape mismatch when using UpSampling
