I am developing an image captioning model using the Flickr8k dataset. When I try to fit the model, I get this error:
***ValueError: Input 0 of layer dense is incompatible with the layer: expected axis -1 of input shape to have value 2048 but received input with shape (None, 1)***
This is the code I am trying to replicate: https://www.kaggle.com/shadabhussain/automated-image-captioning-flickr8/comments
However, when I run model.fit, I get a ValueError.
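# Imports assumed here; the original kernel may use the standalone keras package instead
from tensorflow.keras.layers import Activation, Concatenate, Dense, LSTM
from tensorflow.keras.models import Model

# image_model and language_model are defined earlier (not shown here)
conca = Concatenate()([image_model.output, language_model.output])
x = LSTM(128, return_sequences=True)(conca)
x = LSTM(512, return_sequences=False)(x)
x = Dense(vocab_size)(x)
out = Activation('softmax')(x)
model = Model(inputs=[image_model.input, language_model.input], outputs=out)
# model.load_weights("../input/model_weights.h5")
model.compile(loss='categorical_crossentropy', optimizer='RMSprop', metrics=['accuracy'])
model.summary()
# This is the call that raises the ValueError
hist = model.fit([images, captions], next_words, batch_size=512, epochs=200)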
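One way to narrow this down might be to compare the shapes of the arrays passed to model.fit with the shapes the model inputs expect. The error says the first Dense layer wants a last axis of 2048 but received (None, 1), which suggests `images` may not be a 2-D array of feature vectors. The helper below is a minimal sketch, not part of the original kernel: `check_input_shapes` is a hypothetical name, the arrays are stand-in data, and the caption length of 40 is an assumed value for illustration (only 2048 comes from the error message).

```python
import numpy as np


def check_input_shapes(arrays, expected_shapes):
    """Compare each array's shape with the shape a model input expects.

    `None` in an expected shape means "any size" (e.g. the batch axis).
    Returns a list of mismatch messages (empty if everything matches).
    """
    problems = []
    for index, (array, expected) in enumerate(zip(arrays, expected_shapes)):
        actual = np.asarray(array).shape
        same_rank = len(actual) == len(expected)
        axes_match = same_rank and all(
            want is None or got == want for got, want in zip(actual, expected)
        )
        if not axes_match:
            problems.append(f"input {index}: got {actual}, expected {expected}")
    return problems


# Stand-in data (assumption): a 1-D `images` array instead of (N, 2048) features
images = np.zeros((4,))
captions = np.zeros((4, 40))  # 40 is an assumed maximum caption length

# With the real model these could come from image_model.input_shape
# and language_model.input_shape
expected = [(None, 2048), (None, 40)]

problems = check_input_shapes([images, captions], expected)
for message in problems:
    print(message)
```

If the first input is reported as a mismatch, the likely fix is in how `images` is built (for example, stacking the per-image feature vectors into a single 2-D array) rather than in the model definition itself.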
I have attached an image of the model architecture.
Thanks.