Expected conv2d_7 to have shape (4, 268, 1) but got array with shape (1, 270, 480)

Date: 2019-03-19 20:22:31

Tags: python machine-learning keras deep-learning autoencoder

I'm having trouble with this autoencoder I'm building using Keras. The input's shape depends on the screen size, and the output is to be a prediction of the next screen... but there seems to be an error that I can't figure out... Please excuse my awful formatting on this site...

Code:

import numpy as np
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

def model_build():
    input_img = Input(shape=(1, env_size()[1], env_size()[0]))
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(32, (3, 3), activation='relu')(x)  # note: no padding='same' here
    x = UpSampling2D((2, 2))(x)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
    model = Model(input_img, decoded)
    return model
if __name__ == '__main__':
    model = model_build()
    model.compile('adam', 'mean_squared_error')
    y = np.array([env()])
    print(y.shape)
    print(y.ndim)
    debug = model.fit(np.array([[env()]]), np.array([[env()]]))

Error:

Traceback (most recent call last):
  File "/home/ai/Desktop/algernon-test/rewarders.py", line 46, in <module>
    debug = model.fit(np.array([[env()]]), np.array([[env()]]))
  File "/home/ai/.local/lib/python3.6/site-packages/keras/engine/training.py", line 952, in fit
    batch_size=batch_size)
  File "/home/ai/.local/lib/python3.6/site-packages/keras/engine/training.py", line 789, in _standardize_user_data
    exception_prefix='target')
  File "/home/ai/.local/lib/python3.6/site-packages/keras/engine/training_utils.py", line 138, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected conv2d_7 to have shape (4, 268, 1) but got array with shape (1, 270, 480)

EDIT:

Code for get_screen, which is imported as env():

def get_screen():
    img = screen.grab()              # capture the screen
    img = img.resize(screen_size())  # resize to the working resolution
    img = img.convert('L')           # convert to grayscale
    img = np.array(img)
    return img

2 Answers:

Answer 0 (score: 1)

It looks like env_size() and env() are messing up the image dimensions. Consider the following example:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

image1 = np.random.rand(1, 1, 270, 480)  # first dimension is batch size, for test purposes
image2 = np.random.rand(1, 4, 268, 1)    # or any other arbitrary dimensions

input_img = layers.Input(shape=image1[0].shape)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(32, (3, 3), activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
model = tf.keras.Model(input_img, decoded)
model.compile('adam', 'mean_squared_error')
model.summary()

This line will work:

model.fit(image1, image1, epochs=1, batch_size=1)

But this will not, because the model's input layer was built from image1's shape:

model.fit(image2, image2, epochs=1, batch_size=1)

EDIT: In order to get an output the same size as the input, you need to calculate the convolution kernel sizes carefully:

image1 = np.random.rand(1, 1920, 1080, 1)

input_img = layers.Input(shape=image1[0].shape)
x = layers.Conv2D(32, 3, activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(16, 3, activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, 3, activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, 3, activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, 3, activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(32, 1, activation='relu')(x) # set kernel size to 1 for example
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, 3, activation='sigmoid', padding='same')(x)
model = tf.keras.Model(input_img, decoded)
model.compile('adam', 'mean_squared_error')
model.summary()

This will output the same dimensions.

Follow this guide: http://cs231n.github.io/convolutional-networks/

We can compute the spatial size of the output volume as a function of the input volume size (W), the receptive field size of the Conv Layer neurons (F), the stride with which they are applied (S), and the amount of zero padding used (P) on the border. You can convince yourself that the correct formula for calculating how many neurons "fit" is given by (W - F + 2P)/S + 1. For example, for a 7x7 input and a 3x3 filter with stride 1 and pad 0 we would get a 5x5 output. With stride 2 we would get a 3x3 output.
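To make that concrete for this question, here is a minimal sketch of the formula applied to the shapes above (the helper name conv_output_size is illustrative, not part of Keras):

def conv_output_size(W, F, S=1, P=0):
    """Spatial output size of a conv layer: (W - F + 2*P) / S + 1."""
    return (W - F + 2 * P) // S + 1

print(conv_output_size(7, 3, S=1, P=0))   # 5: the 7x7 input / 3x3 filter example
print(conv_output_size(7, 3, S=2, P=0))   # 3: same filter with stride 2

# The one Conv2D in the model without padding='same' runs at width 136
# (270 pooled three times, then upsampled twice) and trims it to 134;
# the final UpSampling2D doubles that to 268 instead of 270.
print(conv_output_size(136, 3))           # 134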

Answer 1 (score: 1)

You have three 2x downsampling steps and three 2x upsampling steps. These steps have no knowledge of the original image size, so they will round the size out to the nearest multiple of 8 = 2^3.
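A quick way to see that rounding (a pure-arithmetic sketch, not Keras itself; it follows the padding='same' pooling path and ignores the one unpadded Conv2D):

import math

def round_trip(size, steps=3):
    for _ in range(steps):
        size = math.ceil(size / 2)  # MaxPooling2D((2, 2), padding='same')
    for _ in range(steps):
        size *= 2                   # UpSampling2D((2, 2))
    return size

print(round_trip(480))  # 480 -- already a multiple of 8
print(round_trip(270))  # 272 -- rounded up to the nearest multiple of 8

In the question's model, the single Conv2D without padding='same' also trims 2 pixels before the last upsampling (136 -> 134), which is why the reported shape shows 268 rather than 272.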


It should work if you add a new final layer...

cropX = 7 - ((size[0]+7) % 8)
cropY = 7 - ((size[1]+7) % 8)
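The answer doesn't show where the crop is applied; one plausible way to wire it in (an assumption, not the answerer's exact code; `size` is taken to be the (width, height) of the capture, and `decoded`/`input_img` come from the model above) is a Cropping2D layer after the decoder output:

import tensorflow as tf
from tensorflow.keras import layers

cropX = 7 - ((size[0] + 7) % 8)  # e.g. size[0] = 480 -> cropX = 0
cropY = 7 - ((size[1] + 7) % 8)  # e.g. size[1] = 270 -> cropY = 2

# Cropping2D takes ((top, bottom), (left, right)) pixel counts, so this
# trims the rounded-up decoder output back to the original input size.
cropped = layers.Cropping2D(cropping=((0, cropY), (0, cropX)))(decoded)
model = tf.keras.Model(input_img, cropped)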