I am using pretrained word vectors (fastText) and then running a CNN model, and I seem to be hitting a shape mismatch between the embedding input and the output layer. I checked this similar question but still can't figure out how to fix it.
Here is my CNN architecture:
# CNN architecture
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, MaxPooling1D, GlobalMaxPooling1D, Dropout, Dense
from keras import optimizers, regularizers

# nb_words, embed_dim, embedding_matrix, num_filters, weight_decay and
# num_classes are defined earlier in the script.
max_seq_len = 150
print("training CNN ...")
model = Sequential()
model.add(Embedding(nb_words, embed_dim,
                    weights=[embedding_matrix], input_length=max_seq_len, trainable=False))
model.add(Conv1D(num_filters, 7, activation='relu', padding='same'))
model.add(MaxPooling1D(2))
model.add(Conv1D(num_filters, 7, activation='relu', padding='same'))
model.add(GlobalMaxPooling1D())
model.add(Dropout(0.5))
model.add(Dense(32, activation='relu', kernel_regularizer=regularizers.l2(weight_decay)))
model.add(Dense(num_classes, activation='sigmoid'))  # multi-label (k-hot encoding)
adam = optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
model.summary()
Output:
training CNN ...
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_3 (Embedding) (None, 150, 300) 2695800
_________________________________________________________________
conv1d_5 (Conv1D) (None, 150, 64) 134464
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 75, 64) 0
_________________________________________________________________
conv1d_6 (Conv1D) (None, 75, 64) 28736
_________________________________________________________________
global_max_pooling1d_3 (Glob (None, 64) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 64) 0
_________________________________________________________________
dense_5 (Dense) (None, 32) 2080
_________________________________________________________________
dense_6 (Dense) (None, 8) 264
=================================================================
Total params: 2,861,344
Trainable params: 165,544
Non-trainable params: 2,695,800
model.fit then fails with an error because the input does not match what the embedding layer expects:
# define callbacks
from keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(monitor='val_loss', min_delta=0.01, patience=4, verbose=1)
callbacks_list = [early_stopping]

# model training (batch_size and num_epochs are defined earlier in the script)
hist = model.fit(word_seq_train, y_train, batch_size=batch_size, epochs=num_epochs, callbacks=callbacks_list, validation_split=0.1, shuffle=True, verbose=2)
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-41-1a4b3093afeb> in <module>()
3 callbacks_list = [early_stopping]
4 #model training
----> 5 hist = model.fit(word_seq_train, y_train, batch_size=batch_size, epochs=num_epochs, callbacks=callbacks_list, validation_split=0.1, shuffle=True, verbose=2)
2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
143 ': expected ' + names[i] + ' to have shape ' +
144 str(shape) + ' but got array with shape ' +
--> 145 str(data_shape))
146 return data
147
ValueError: Error when checking input: expected embedding_3_input to have shape (150,) but got array with shape (74,)
Some additional information:
print(word_seq_train.shape)
print(y_train.shape)
print(embedding_matrix.shape)
>>(1446, 74)
>>(1446,)
>>(8986, 300)
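The shapes above show the mismatch directly: the Embedding layer was built with input_length=max_seq_len (150), so Keras requires every row of word_seq_train to hold exactly 150 token ids, but each of my rows holds only 74. A minimal check using the variables above:

# Sketch: the Embedding layer requires rows of length max_seq_len.
print(word_seq_train.shape[1], max_seq_len)  # 74 vs. 150 -- hence the ValueError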
Answer 0 (score: 0):
From @marco's comment: using pad_sequences works.
from keras.preprocessing import sequence

max_seq_len = 150
word_seq_train = sequence.pad_sequences(word_seq_train, maxlen=max_seq_len)
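For illustration, a minimal sketch of what pad_sequences does here (the toy sequences below are made up; by default Keras left-pads with zeros and truncates sequences longer than maxlen):

from keras.preprocessing import sequence

# Hypothetical toy data: three tokenized sequences of unequal length.
toy = [[5, 8, 2], [9, 1], [4, 4, 4, 4]]
padded = sequence.pad_sequences(toy, maxlen=6)
print(padded.shape)  # (3, 6): every row now has exactly maxlen entries
print(padded)
# [[0 0 0 5 8 2]
#  [0 0 0 0 9 1]
#  [0 0 4 4 4 4]]

After padding, word_seq_train has shape (1446, 150), which matches the (150,) input that embedding_3 expects.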