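I followed the Keras tutorial to build an autoencoder for MNIST. The autoencoder worked, so I then switched to different images, which changed the input shape from (28, 28, 1) to (150, 150, 3). After that I got the following error:
ValueError: Error when checking target: expected conv2d_6 to have shape (148, 148, 1) but got array with shape (150, 150, 3)
Autoencoder architecture: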
input_img = Input(shape=(150, 150, 3))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer=Adam(0.01), loss='binary_crossentropy')
Training setup:
autoencoder.fit(x_train, y_train,
epochs=50,
batch_size=512,
shuffle=True,
validation_data=(x_test, y_test))
My data shapes are as follows:
x_train shape: (4022, 150, 150, 3)
y_train shape: (4022, 150, 150, 3)
x_test shape: (447, 150, 150, 3)
y_test shape: (447, 150, 150, 3)
Colab link to my workspace:
https://colab.research.google.com/drive/1C8RX7OYS2BXaHJh6VOMscxEbrFTuQY5H
Answer 0 (score: 2)
Use this code and it will work:
input_img = Input(shape=(150, 150, 3))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = ZeroPadding2D(padding=(1, 1))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='valid')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer=Adam(0.01), loss='binary_crossentropy')
I added zero padding and changed the final convolution layer to output 3 channels.
This prints the following summary:
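To see why the padding makes the sizes line up, here is a minimal sketch that traces only the height/width through the layers of the code above, in plain Python without Keras. It assumes 3x3 stride-1 convolutions ('same' keeps the size, 'valid' drops 2 pixels) and that 'same' pooling rounds up. The helper names (`conv`, `pool`, `upsample`, `zero_pad`, `trace_fixed`) are made up for illustration and are not Keras functions.

```python
import math


def conv(size, padding="same", kernel=3):
    # Stride-1 convolution: 'same' keeps the size, 'valid' loses kernel - 1 pixels.
    return size if padding == "same" else size - (kernel - 1)


def pool(size, factor=2):
    # 2x2 pooling with padding='same' rounds odd sizes up.
    return math.ceil(size / factor)


def upsample(size, factor=2):
    # UpSampling2D simply multiplies the spatial size.
    return size * factor


def zero_pad(size, pad=1):
    # ZeroPadding2D(padding=(1, 1)) adds one pixel on each side.
    return size + 2 * pad


def trace_fixed(size=150):
    """Return the spatial size after each size-changing layer of the fixed model."""
    sizes = [size]
    for _ in range(3):                      # encoder: conv('same') + pool, three times
        size = pool(conv(size))
        sizes.append(size)
    for _ in range(2):                      # decoder: conv('same') + upsample, twice
        size = upsample(conv(size))
        sizes.append(size)
    size = zero_pad(size)                   # added padding layer
    sizes.append(size)
    size = conv(size, padding="valid")      # Conv2D(16, ...) with default 'valid'
    sizes.append(size)
    size = upsample(size)
    sizes.append(size)
    size = conv(size, padding="valid")      # final Conv2D(3, ..., padding='valid')
    sizes.append(size)
    return sizes


print(trace_fixed(150))
```

Under these assumptions the trace ends at 150, so the output matches the 150 x 150 targets.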
Layer (type)                 Output Shape              Param #
=================================================================
input_10 (InputLayer)        (None, 150, 150, 3)       0
_________________________________________________________________
conv2d_58 (Conv2D)           (None, 150, 150, 16)      448
_________________________________________________________________
max_pooling2d_25 (MaxPooling (None, 75, 75, 16)        0
_________________________________________________________________
conv2d_59 (Conv2D)           (None, 75, 75, 8)         1160
_________________________________________________________________
max_pooling2d_26 (MaxPooling (None, 38, 38, 8)         0
_________________________________________________________________
conv2d_60 (Conv2D)           (None, 38, 38, 8)         584
_________________________________________________________________
max_pooling2d_27 (MaxPooling (None, 19, 19, 8)         0
_________________________________________________________________
conv2d_61 (Conv2D)           (None, 19, 19, 8)         584
_________________________________________________________________
up_sampling2d_25 (UpSampling (None, 38, 38, 8)         0
_________________________________________________________________
conv2d_62 (Conv2D)           (None, 38, 38, 8)         584
_________________________________________________________________
up_sampling2d_26 (UpSampling (None, 76, 76, 8)         0
_________________________________________________________________
zero_padding2d_4 (ZeroPaddin (None, 78, 78, 8)         0
_________________________________________________________________
conv2d_63 (Conv2D)           (None, 76, 76, 16)        1168
_________________________________________________________________
up_sampling2d_27 (UpSampling (None, 152, 152, 16)      0
_________________________________________________________________
conv2d_64 (Conv2D)           (None, 150, 150, 3)       435
=================================================================
Total params: 4,963
Trainable params: 4,963
Non-trainable params: 0
_________________________________________________________________
None
Answer 1 (score: 0)
I think your final decoding layer should decode to three channels rather than one.
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
should be
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
No?
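Changing the channel count addresses the last dimension of the mismatch. As a rough check of the height/width as well, the sketch below walks a spatial size through the layer sequence listed in the question, in plain Python without Keras. It assumes 3x3 stride-1 convolutions and that 'same' pooling rounds up. The names `QUESTION_LAYERS` and `output_size` are hypothetical, chosen only for this example.

```python
import math

# (layer kind, padding) for each layer of the question's model, in order.
# The third decoder convolution has no padding argument, so it uses 'valid'.
QUESTION_LAYERS = [
    ("conv", "same"), ("pool", "same"),
    ("conv", "same"), ("pool", "same"),
    ("conv", "same"), ("pool", "same"),
    ("conv", "same"), ("up", None),
    ("conv", "same"), ("up", None),
    ("conv", "valid"), ("up", None),
    ("conv", "same"),
]


def output_size(size, layers):
    """Trace one spatial dimension through 3x3 convs, 2x2 pools and 2x upsampling."""
    for kind, padding in layers:
        if kind == "conv" and padding == "valid":
            size -= 2                    # a 3x3 'valid' conv trims one pixel per side
        elif kind == "pool":
            size = math.ceil(size / 2)   # 'same' pooling rounds odd sizes up
        elif kind == "up":
            size *= 2
        # a 'same' conv leaves the size unchanged
    return size


print(output_size(28, QUESTION_LAYERS))   # MNIST-sized input
print(output_size(150, QUESTION_LAYERS))  # the 150 x 150 images from the question
```

Under these assumptions the layers map 28 back to 28 but map 150 to 148, which matches the 148 in the error message. So with 150 x 150 inputs the spatial size would likely still need handling (for example with padding) in addition to the channel change.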