Layer incompatibility error in an image captioning model

Time: 2021-06-14 09:30:27

Tags: python image deep-learning conv-neural-network

I am developing an image captioning model using the Flickr8k dataset. When I try to fit the model, I get this error:

***ValueError: Input 0 of layer dense is incompatible with the layer: expected axis -1 of input shape to have value 2048 but received input with shape (None, 1)***
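Read literally, the message says that a `Dense` layer built for inputs whose last axis has size 2048 received an array of shape `(None, 1)` at fit time. Keras commonly reports `(None, 1)` when a 1-D array of shape `(N,)` is passed as an input, so checking the array shapes before calling `model.fit` is a reasonable first step. The sketch below is only a diagnostic aid: the helper name `find_shape_problems` is hypothetical, and the expected layout (image features as `(N, 2048)`, captions and next-word targets as 2-D arrays) is an assumption inferred from the error text, not something the post confirms.

```python
import numpy as np


def find_shape_problems(images, captions, next_words, feature_dim=2048):
    """Return a list of human-readable shape problems (empty if none are found).

    Assumed layout (not confirmed by the post):
      images:     (N, feature_dim)  pre-extracted image features
      captions:   (N, max_len)      padded partial captions
      next_words: (N, vocab_size)   one-hot next-word targets
    """
    arrays = {
        "images": np.asarray(images),
        "captions": np.asarray(captions),
        "next_words": np.asarray(next_words),
    }
    problems = []

    img = arrays["images"]
    if img.ndim != 2 or img.shape[-1] != feature_dim:
        problems.append(
            f"images has shape {img.shape}, expected (N, {feature_dim})"
        )

    for name in ("captions", "next_words"):
        if arrays[name].ndim != 2:
            problems.append(f"{name} has shape {arrays[name].shape}, expected 2-D")

    # All three arrays must describe the same number of training samples.
    counts = {n: (a.shape[0] if a.ndim > 0 else 0) for n, a in arrays.items()}
    if len(set(counts.values())) != 1:
        problems.append(f"sample counts differ: {counts}")

    return problems
```

Calling `find_shape_problems(images, captions, next_words)` just before `model.fit` and printing the result would show whether `images` is the array that arrives with the wrong shape.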

This is the code I am trying to reproduce: https://www.kaggle.com/shadabhussain/automated-image-captioning-flickr8/comments

But when I run model.fit, I get a ValueError.

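# Imports are assumed (tf.keras); the original post does not show them.
from tensorflow.keras.layers import Activation, Concatenate, Dense, LSTM
from tensorflow.keras.models import Model
# Merge the image branch and the language branch along the last axis.
conca = Concatenate()([image_model.output, language_model.output])
x = LSTM(128, return_sequences=True)(conca)
x = LSTM(512, return_sequences=False)(x)
# Project to the vocabulary size, then normalize to next-word probabilities.
x = Dense(vocab_size)(x)
out = Activation('softmax')(x)
model = Model(inputs=[image_model.input, language_model.input], outputs=out)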

# model.load_weights("../input/model_weights.h5")
model.compile(loss='categorical_crossentropy', optimizer='RMSprop', metrics=['accuracy'])
model.summary()
hist = model.fit([images, captions], next_words, batch_size=512, epochs=200)
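If `images` turns out to be a 1-D array rather than an `(N, 2048)` matrix, one way to get the expected shape is to stack one 2048-element feature vector per training sample. The following is a minimal sketch under that assumption. The helper `stack_image_features` is hypothetical, and it assumes each per-sample feature can be flattened to exactly 2048 values (for example an encoder output shaped `(1, 2048)` or `(2048,)`). Whether this matches how `images` was built here is not known from the post.

```python
import numpy as np


def stack_image_features(features, feature_dim=2048):
    """Stack per-sample feature vectors into a float32 array of shape (N, feature_dim).

    `features` is assumed to be a sequence with one entry per training sample.
    """
    rows = []
    for i, feature in enumerate(features):
        # Flatten shapes such as (1, 2048) or (2048,) to a plain 1-D vector.
        row = np.asarray(feature, dtype=np.float32).reshape(-1)
        if row.size != feature_dim:
            raise ValueError(
                f"feature {i} has {row.size} values, expected {feature_dim}"
            )
        rows.append(row)

    if not rows:
        return np.empty((0, feature_dim), dtype=np.float32)
    return np.stack(rows)
```

The stacked array would then be passed as the first element of the input list, as in `model.fit([images, captions], next_words, ...)`.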

I have attached an image of the model architecture.

Thanks.

0 answers:

No answers