How do I fix a dimension error in a simple autoencoder?

Time: 2018-12-21 11:26:04

Tags: python tensorflow machine-learning keras autoencoder

I'm new to Python and autoencoders. I'm just trying to build a simple autoencoder to get started, but I keep getting this error:

ValueError: Error when checking target: expected conv2d_39 to have 4 dimensions, but got array with shape (32, 3)

Is there a better way to load my own data than the flow_from_directory method? I modeled my autoencoder on this one, but I removed some layers.

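I'm not sure, but am I feeding the tuple produced by flow_from_directory into the autoencoder? Is there a way to convert that tuple into a format the autoencoder accepts?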

import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import (Dropout, Flatten, Dense, Input, Conv2D,
                          UpSampling2D, MaxPooling2D)
from keras.optimizers import RMSprop

IMG_WIDTH, IMG_HEIGHT = 112, 112
input_img = Input(shape=(IMG_WIDTH, IMG_HEIGHT,3))

#encoder
def encoder(input_img):
    # 1x112x112x3
    conv1 = Conv2D(32,(3,3), activation='relu', padding='same')(input_img)
    # 32x112x112
    pool1 = MaxPooling2D(pool_size=(2,2))(conv1)
    # 32x56x56
    return pool1

#decoder
def decoder(pool1):
    # 32x56x56
    up1 = UpSampling2D((2,2))(pool1)
    # 32x112x112
    decoded = Conv2D(1,(3,3),activation='sigmoid',padding='same')(up1)
    # 1x112x112
    return decoded

autoencoder = Model(input_img, decoder(encoder(input_img)))
autoencoder.compile(loss='mean_squared_error', optimizer=RMSprop())

datagen = ImageDataGenerator(rescale=1./255)

training_set = datagen.flow_from_directory(
    r'C:\Users\user\Desktop\dataset\train',
    target_size=(112,112),
    batch_size=32,
    class_mode='categorical')

test_set = datagen.flow_from_directory(
    r'C:\Users\user\Desktop\dataset\validation',
    target_size=(112,112),
    batch_size=32,
    class_mode='categorical')

history = autoencoder.fit_generator(
    training_set,
    steps_per_epoch=2790,
    epochs=5,
    validation_data=test_set,
    validation_steps=1145)

Here is the model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_14 (InputLayer)        (None, 112, 112, 3)       0         
_________________________________________________________________
conv2d_42 (Conv2D)           (None, 112, 112, 32)      896       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 56, 56, 32)        0         
_________________________________________________________________
up_sampling2d_4 (UpSampling2 (None, 112, 112, 32)      0         
_________________________________________________________________
conv2d_43 (Conv2D)           (None, 112, 112, 1)       289       
=================================================================
Total params: 1,185
Trainable params: 1,185
Non-trainable params: 0
_________________________________________________________________

I am working with 512x496 OCT images.

2 Answers:

Answer 0: (score: 2)

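Since you are building an autoencoder, the training target is the input itself, so the model's output must have the same shape as its input. That means there are two problems with the code:

  1. You must set the generators' class_mode argument to 'input' so that the generated labels are the same as the generated inputs. With class_mode='categorical', each batch's target is an array of one-hot class labels, which is where the shape (32, 3) in the error comes from.

  2. Since the input images have 3 channels, the last layer must have 3 filters: decoded = Conv2D(3, ...)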
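As a rough illustration of both points, here is a minimal sketch of the asker's model with the last layer changed to 3 filters, plus a generator helper that uses class_mode='input'. The names build_autoencoder and make_generator are hypothetical helpers introduced only for this example. The datagen argument is assumed to be an ImageDataGenerator like the one in the question, and the dummy batch stands in for real images.

```python
import numpy as np
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

IMG_WIDTH, IMG_HEIGHT = 112, 112


def build_autoencoder():
    """Hypothetical helper: the question's model with a 3-channel output."""
    input_img = Input(shape=(IMG_WIDTH, IMG_HEIGHT, 3))
    # Encoder: (112, 112, 3) -> (112, 112, 32) -> (56, 56, 32)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    # Decoder: (56, 56, 32) -> (112, 112, 32) -> (112, 112, 3)
    x = UpSampling2D((2, 2))(x)
    # 3 filters so the output has as many channels as the RGB input
    decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
    model = Model(input_img, decoded)
    model.compile(loss='mean_squared_error', optimizer='rmsprop')
    return model


def make_generator(datagen, directory):
    """Hypothetical helper: yield (images, images) batches for an autoencoder."""
    return datagen.flow_from_directory(
        directory,
        target_size=(IMG_WIDTH, IMG_HEIGHT),
        batch_size=32,
        class_mode='input')  # targets are the input images, not class labels


autoencoder = build_autoencoder()
# Dummy batch of two blank images, only to check the output shape
dummy = np.zeros((2, IMG_WIDTH, IMG_HEIGHT, 3), dtype='float32')
print(autoencoder.predict(dummy, verbose=0).shape)
```

With these two changes, the target batches and the model output should both have shape (batch, 112, 112, 3), which is what the "expected ... 4 dimensions" check is asking for.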

Answer 1: (score: 0)

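I believe you are feeding the network labels rather than images. When constructing the data generators, try explicitly setting class_mode to None; it defaults to 'categorical'.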
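One caveat with this suggestion: with class_mode=None the generator yields image batches only, with no targets, so for training it would still need to be wrapped so that each batch becomes an (input, target) pair. A minimal sketch of such a wrapper follows. Both pair_inputs_as_targets and fake_image_batches are hypothetical names, and the random batches merely stand in for what flow_from_directory(..., class_mode=None) would produce.

```python
import numpy as np


def pair_inputs_as_targets(image_batches):
    """Turn a stream of image batches into (input, target) pairs
    where the target is the input itself."""
    for batch in image_batches:
        yield batch, batch


def fake_image_batches(n_batches, batch_size=32, size=112):
    # Hypothetical stand-in for a generator created with class_mode=None
    rng = np.random.default_rng(0)
    for _ in range(n_batches):
        yield rng.random((batch_size, size, size, 3), dtype=np.float32)


x, y = next(pair_inputs_as_targets(fake_image_batches(1)))
print(x.shape, y.shape)
```

Setting class_mode='input', as in the other answer, should make the generator produce such pairs directly, so a wrapper like this is only needed if class_mode=None is used.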