我正在与Keras合作完成句子相似性任务(使用STS数据集),并且在合并图层时遇到问题。数据由1184个句子对组成,每个句子对在0到5之间得分。下面是我的numpy数组的形状。我已经将每个句子填充到50个单词并运行它们并嵌入图层,使用100个尺寸的手套嵌入。合并这两个网络时,我收到了错误..
Exception: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 arrays but instead got the following list of 2 arrays:
这是我的代码看起来像
total training data = 1184
X1.shape = (1184, 50)
X2.shape = (1184, 50)
Y.shape = (1184, 1)
embedding_matrix = np.zeros((len(word_index) + 1, EMBEDDING_DIM))
for word, i in word_index.items():
embedding_vector = embeddings_index.get(word)
if embedding_vector is not None:
# words not found in embedding index will be all-zeros.
embedding_matrix[i] = embedding_vector
embedding_layer = Embedding(len(word_index) + 1,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=50,
trainable=False)
s1rnn = Sequential()
s1rnn.add(embedding_layer)
s1rnn.add(LSTM(128, input_shape=(100, 1)))
s1rnn.add(Dense(1))
s2rnn = Sequential()
s2rnn.add(embedding_layer)
s2rnn.add(LSTM(128, input_shape=(100, 1)))
s2rnn.add(Dense(1))
model = Sequential()
model.add(Merge([s1rnn,s2rnn],mode='concat'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='RMSprop', metrics=['accuracy'])
model.fit([X1,X2], Y,batch_size=32, nb_epoch=100, validation_split=0.05)
答案 0 :(得分:4)
问题不在于合并层。您需要创建两个嵌入层以提供2个不同的输入。
以下修改应该有效:
embedding_layer_1 = Embedding(len(word_index) + 1,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=50,
trainable=False)
embedding_layer_2 = Embedding(len(word_index) + 1,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=50,
trainable=False)
s1rnn = Sequential()
s1rnn.add(embedding_layer_1)
s1rnn.add(LSTM(128, input_shape=(100, 1)))
s1rnn.add(Dense(1))
s2rnn = Sequential()
s2rnn.add(embedding_layer_2)
s2rnn.add(LSTM(128, input_shape=(100, 1)))
s2rnn.add(Dense(1))