Can anyone help me merge layers with Keras?

Time: 2019-02-28 08:03:27

Tags: python machine-learning keras

I came across this code in https://github.com/raducrs/Applications-of-Deep-Learning/blob/master/Image%20captioning%20Flickr8k.ipynb and tried to get it running in Google Colab, but when I run the code below it gives me an error saying

Merge is deprecated

I would like to know how to run this code with the latest version of Keras.

LSTM_CELLS_CAPTION = 256
LSTM_CELLS_MERGED = 1000

image_pre = Sequential()
image_pre.add(Dense(100, input_shape=(IMG_FEATURES_SIZE,), activation='relu', name='fc_image'))
image_pre.add(RepeatVector(MAX_SENTENCE,name='repeat_image'))

caption_model = Sequential()
caption_model.add(Embedding(VOCABULARY_SIZE, EMB_SIZE,
                            weights=[embedding_matrix],
                            input_length=MAX_SENTENCE,
                            trainable=False, name="embedding"))
caption_model.add(LSTM(EMB_SIZE, return_sequences=True, name="lstm_caption"))
caption_model.add(TimeDistributed(Dense(100, name="td_caption")))

combined = Sequential()
combined.add(Merge([image_pre, caption_model], mode='concat', concat_axis=1,name="merge_models"))
combined.add(Bidirectional(LSTM(256,return_sequences=False, name="lstm_merged"),name="bidirectional_lstm"))
combined.add(Dense(VOCABULARY_SIZE,name="fc_merged"))
combined.add(Activation('softmax',name="softmax_combined"))

predictive = Model([image_pre.input, caption_model.input],combined.output)

1 answer:

Answer 0 (score: 1):

Merge(mode='concat') is now Concatenate(axis=1).
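As a minimal sketch of what the axis argument does (using NumPy's np.concatenate, which follows the same axis convention as Keras' Concatenate; the shapes mirror the two branch outputs in the answer below):

```python
import numpy as np

# Two batches of sequences shaped (batch, timesteps, features),
# mimicking the outputs of image_pre and caption_model.
a = np.zeros((2, 80, 100))
b = np.zeros((2, 80, 100))

# axis=1 concatenates along the time axis: one sequence is stacked after the other.
time_concat = np.concatenate([a, b], axis=1)
print(time_concat.shape)  # (2, 160, 100)

# axis=-1 concatenates along the feature axis instead:
# same number of timesteps, each carrying both branches' features.
feat_concat = np.concatenate([a, b], axis=-1)
print(feat_concat.shape)  # (2, 80, 200)
```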

The following code generates the graph correctly on Colab.

import numpy as np

from keras.layers import *
from keras.models import Model, Sequential

IMG_FEATURES_SIZE = 10
MAX_SENTENCE = 80
VOCABULARY_SIZE = 1000
EMB_SIZE = 100

embedding_matrix = np.zeros((VOCABULARY_SIZE, EMB_SIZE))

LSTM_CELLS_CAPTION = 256
LSTM_CELLS_MERGED = 1000

image_pre = Sequential()
image_pre.add(Dense(100, input_shape=(IMG_FEATURES_SIZE,), activation='relu', name='fc_image'))
image_pre.add(RepeatVector(MAX_SENTENCE,name='repeat_image'))

caption_model = Sequential()
caption_model.add(Embedding(VOCABULARY_SIZE, EMB_SIZE,
                            weights=[embedding_matrix],
                            input_length=MAX_SENTENCE,
                            trainable=False, name="embedding"))
caption_model.add(LSTM(EMB_SIZE, return_sequences=True, name="lstm_caption"))
caption_model.add(TimeDistributed(Dense(100, name="td_caption")))

merge = Concatenate(axis=1,name="merge_models")([image_pre.output, caption_model.output])
lstm = Bidirectional(LSTM(256,return_sequences=False, name="lstm_merged"),name="bidirectional_lstm")(merge)
output = Dense(VOCABULARY_SIZE, name="fc_merged", activation='softmax')(lstm)

predictive = Model([image_pre.input, caption_model.input], output)
predictive.compile('sgd', 'binary_crossentropy')
predictive.summary()

Explanation:

This is a model with two inputs per sample: an image and a caption (a sequence of words). The input graphs are merged at the concatenation point (name="merge_models").

The image is processed only by a Dense layer (you may want to add convolutions to the image branch); the output of that Dense layer is then copied MAX_SENTENCE times to prepare it for the merge.
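The effect of RepeatVector can be sketched in plain NumPy (an illustration of the shape transformation, not Keras itself): it turns a (batch, features) tensor into (batch, MAX_SENTENCE, features) by copying each vector along a new time axis.

```python
import numpy as np

MAX_SENTENCE = 80

# Output of the Dense image layer: one 100-d vector per sample.
dense_out = np.random.rand(2, 100)

# Equivalent of RepeatVector(MAX_SENTENCE): insert a time axis and
# copy each vector MAX_SENTENCE times along it.
repeated = np.repeat(dense_out[:, np.newaxis, :], MAX_SENTENCE, axis=1)
print(repeated.shape)  # (2, 80, 100)
```

Every timestep holds an identical copy of the image features, so the image information is available alongside each caption position after merging.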

The captions are processed by an LSTM and a Dense layer.

Because the concatenation is along axis=1 (the time axis), the merged tensor has 2*MAX_SENTENCE timesteps, with one branch's sequence stacked after the other's; concatenating along the feature axis (axis=-1) would instead give MAX_SENTENCE timesteps that each carry the features of both branches.

The merged branch then finally predicts one class out of VOCABULARY_SIZE.
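That final step, Dense(VOCABULARY_SIZE) followed by softmax, can be sketched in NumPy (a hedged illustration with made-up logits, not the trained model): softmax turns the layer's raw scores into a probability distribution over the vocabulary, and the arg-max index is the predicted word id.

```python
import numpy as np

VOCABULARY_SIZE = 1000

# Hypothetical raw scores (logits) from the final Dense layer for one sample.
logits = np.random.rand(VOCABULARY_SIZE)

# Softmax: exponentiate and normalize so the scores sum to 1.
probs = np.exp(logits) / np.exp(logits).sum()

# The predicted class is the vocabulary index with the highest probability.
predicted_word_id = int(np.argmax(probs))
```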

model.summary() is a good way to understand the graph.