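Here are two neural networks that I am trying to merge with a concatenate operation. The network should classify IMDB movie reviews as 1 (good) or 0 (bad).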
def cnn_lstm_merged():
    embedding_vecor_length = 32
    cnn_model = Sequential()
    cnn_model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
    cnn_model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
    cnn_model.add(MaxPooling1D(pool_size=2))
    cnn_model.add(Flatten())
    lstm_model = Sequential()
    lstm_model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
    lstm_model.add(LSTM(64, activation = 'relu'))
    lstm_model.add(Flatten())
    merge = concatenate([lstm_model, cnn_model])
    hidden = (Dense(1, activation = 'sigmoid'))(merge)
    #print(model.summary())
    output = hidden.fit(X_train, y_train, epochs=3, batch_size=64)
    return output
But when I run the code, I get this error:
File "/home/pythonist/Desktop/EnsemblingLSTM_CONV/train.py", line 59, in cnn_lstm_merged
lstm_model.add(Flatten())
File "/home/pythonist/deeplearningenv/lib/python3.6/site-packages/keras/engine/sequential.py", line 185, in add
output_tensor = layer(self.outputs[0])
File "/home/pythonist/deeplearningenv/lib/python3.6/site-packages/keras/engine/base_layer.py", line 414, in __call__
self.assert_input_compatibility(inputs)
File "/home/pythonist/deeplearningenv/lib/python3.6/site-packages/keras/engine/base_layer.py", line 327, in assert_input_compatibility
str(K.ndim(x)))
ValueError: Input 0 is incompatible with layer flatten_2: expected min_ndim=3, found ndim=2
[Finished in 4.8s with exit code 1]
How can I merge these two models? Thanks.
Answer 0 (score: 0)
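You do not need a Flatten after the LSTM. By default an LSTM returns only its last state, not the whole sequence, so its output has shape (BS, n_output). The Flatten layer, however, expects input of shape (BS, a, b), which it converts to (BS, a*b). That mismatch is what the ValueError (expected min_ndim=3, found ndim=2) is reporting.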
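So either remove the Flatten layer and use only the last state, or add return_sequences=True to the LSTM. With that flag the LSTM returns the output at every time step rather than only the last one, i.e. shape (BS, T, n_out), which Flatten can then reduce to (BS, T*n_out).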
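To make the shape bookkeeping above concrete, here is a small plain-Python sketch that needs no Keras at all. The helper names lstm_output_shape and flatten_output_shape are hypothetical, the numbers (BS=64, T=500, n_out=64) are example values only, and the ndim check simply mirrors the one reported in the traceback; newer Keras versions may behave differently on 2-D input.

```python
def lstm_output_shape(batch, timesteps, units, return_sequences=False):
    """Shape produced by an LSTM fed (batch, timesteps, features) input."""
    if return_sequences:
        # one output vector per time step
        return (batch, timesteps, units)
    # only the last state
    return (batch, units)


def flatten_output_shape(shape):
    """Collapse every axis after the batch axis into a single axis."""
    if len(shape) < 3:
        # same condition as the check in the traceback (illustrative only)
        raise ValueError(
            "expected min_ndim=3, found ndim=%d" % len(shape)
        )
    size = 1
    for dim in shape[1:]:
        size *= dim
    return (shape[0], size)


# Example values, not taken from the question: BS=64, T=500, n_out=64
last_state = lstm_output_shape(64, 500, 64)
all_states = lstm_output_shape(64, 500, 64, return_sequences=True)
flattened = flatten_output_shape(all_states)

try:
    flatten_output_shape(last_state)
    flatten_failed = False
except ValueError:
    flatten_failed = True
```

Here last_state is already two-dimensional, which is why putting Flatten after a default LSTM is unnecessary, while all_states is three-dimensional and can be flattened.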
Edit: Also, the way you create the final model is wrong. Take a look at this example; in your case it should look something like this:
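# a Sequential model cannot hold a layer with two inputs, so build the merged model with Model
from keras.models import Model
merge = concatenate([lstm_model.output, cnn_model.output])
hidden = Dense(1, activation='sigmoid')(merge)
conc_model = Model(inputs=[lstm_model.input, cnn_model.input], outputs=hidden)
conc_model.compile(...)
# each branch has its own input, so the same reviews are passed once per branch
output = conc_model.fit([X_train, X_train], y_train, epochs=3, batch_size=64)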
All in all, the Functional API is the better choice for a model like this.
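As a rough illustration of the Functional API route, here is a minimal sketch. It makes several assumptions that are not in the question: it uses the Keras bundled with TensorFlow (tensorflow.keras), feeds both branches from a single shared input instead of two separate Sequential models, and picks placeholder defaults for top_words and max_review_length. The function name build_cnn_lstm and the compile settings are illustrative choices as well.

```python
from tensorflow.keras.layers import (
    Input, Embedding, Conv1D, MaxPooling1D, Flatten, LSTM, Dense, concatenate,
)
from tensorflow.keras.models import Model


def build_cnn_lstm(top_words=5000, max_review_length=500,
                   embedding_vector_length=32):
    """Build a two-branch CNN + LSTM binary classifier (illustrative sketch)."""
    # One input of padded word indices, shared by both branches (assumption).
    inputs = Input(shape=(max_review_length,), dtype="int32")

    # CNN branch: embedding -> conv -> pooling -> flatten to 2-D
    cnn = Embedding(top_words, embedding_vector_length)(inputs)
    cnn = Conv1D(filters=32, kernel_size=3, padding="same",
                 activation="relu")(cnn)
    cnn = MaxPooling1D(pool_size=2)(cnn)
    cnn = Flatten()(cnn)

    # LSTM branch: only the last state is returned, so no Flatten is needed
    lstm = Embedding(top_words, embedding_vector_length)(inputs)
    lstm = LSTM(64)(lstm)

    # Concatenate the two 2-D feature tensors and classify
    merged = concatenate([lstm, cnn])
    outputs = Dense(1, activation="sigmoid")(merged)

    model = Model(inputs=inputs, outputs=outputs)
    # Loss and optimizer are assumed; the original leaves compile(...) elided.
    model.compile(loss="binary_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model


if __name__ == "__main__":
    build_cnn_lstm().summary()
```

With a shared input the model is trained as model.fit(X_train, y_train, ...). If you keep two separate inputs instead, the same data has to be passed once per input.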
Edit 2: Here is the final code:
cnn_model = Sequential()
cnn_model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
cnn_model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
cnn_model.add(MaxPooling1D(pool_size=2))
cnn_model.add(Flatten())
lstm_model = Sequential()
lstm_model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
lstm_model.add(LSTM(64, activation = 'relu', return_sequences=True))
lstm_model.add(Flatten())
# instead of the last two lines you can also use
# lstm_model.add(LSTM(64, activation = 'relu'))
# then you do not have to use the Flatten layer. depends on your actual needs
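# merge the two branches with the functional API (a Sequential model cannot take two inputs)
from keras.models import Model
merge = concatenate([lstm_model.output, cnn_model.output])
hidden = Dense(1, activation='sigmoid')(merge)
conc_model = Model(inputs=[lstm_model.input, cnn_model.input], outputs=hidden)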