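I am trying to design an LSTM network in Keras that combines word embeddings with other features in a binary classification setting. My test set contains 250 samples per class.
When I run the model using only the word embedding branch (the "model" branch in the code), I get an average F1 of around 0.67. When I create a new branch for additional fixed-size features that I compute separately ("branch2") and merge it with the word embeddings using "concat", the predictions all collapse to a single class (giving perfect recall for that class), and the average F1 drops to 0.33.
Am I adding the features and training/testing incorrectly?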
def create_model(embedding_index, sequence_features, optimizer='rmsprop'):
    # Branch 1: word embeddings
    model = Sequential()
    embedding_layer = create_embedding_matrix(embedding_index, word_index)
    model.add(embedding_layer)
    model.add(Convolution1D(nb_filter=32, filter_length=3, border_mode='same', activation='tanh'))
    model.add(MaxPooling1D(pool_length=2))
    model.add(Bidirectional(LSTM(100)))
    model.add(Dropout(0.2))
    model.add(Dense(2, activation='sigmoid'))

    # Branch 2: other features
    branch2 = Sequential()
    dim = sequence_features.shape[1]
    branch2.add(Dense(15, input_dim=dim, init='normal', activation='tanh'))
    branch2.add(BatchNormalization())

    # Merging branches to create final model
    final_model = Sequential()
    final_model.add(Merge([model, branch2], mode='concat'))
    final_model.add(Dense(2, init='normal', activation='sigmoid'))
    final_model.compile(loss='categorical_crossentropy', optimizer=optimizer,
                        metrics=['accuracy', 'precision', 'recall', 'fbeta_score', 'fmeasure'])
    return final_model

def run(input_train, input_dev, input_test, text_col, label_col, resfile, embedding_index):
    # Processing text and features
    data_train, labels_train, data_test, labels_test = vectorize_text(input_train, input_test, text_col, label_col)
    x_train, y_train = data_train, labels_train
    x_test, y_test = data_test, labels_test
    seq_train = get_sequence_features(input_train).as_matrix()
    seq_test = get_sequence_features(input_test).as_matrix()

    # Generating model
    filepath = lstm_config.WEIGHTS_PATH
    checkpoint = ModelCheckpoint(filepath, monitor='val_fmeasure', verbose=1, save_best_only=True, mode='max')
    callbacks_list = [checkpoint]
    model = create_model(embedding_index, seq_train)
    model.fit([x_train, seq_train], y_train, validation_split=0.33, nb_epoch=3, batch_size=100, callbacks=callbacks_list, verbose=1)

    # Evaluating
    scores = model.evaluate([x_test, seq_test], y_test, verbose=1)
    time.sleep(0.2)
    preds = model.predict_classes([x_test, seq_test])
    preds = to_categorical(preds)
    print(metrics.f1_score(y_true=y_test, y_pred=preds, average="micro"))
    print(metrics.f1_score(y_true=y_test, y_pred=preds, average="macro"))
    print(metrics.classification_report(y_test, preds))

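Output:

Using Theano backend.
Found 2999999 word vectors.
Processing text dataset
Found 7165 unique tokens.
Shape of data tensor: (1996, 50)
Shape of label tensor: (1996, 2)
1996 train 500 test
Train on 1337 samples, validate on 659 samples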
Epoch 1/3
1300/1337 [============================>.] - ETA: 0s - loss: 0.6767 - acc: 0.6669 - precision: 0.5557 - recall: 0.6815 - fbeta_score: 0.6120 - fmeasure: 0.6120
Epoch 00000: val_fmeasure im
1337/1337 [==============================] - 10s - loss: 0.6772 - acc: 0.6672 - precision: 0.5551 - recall: 0.6806 - fbeta_score: 0.6113 - fmeasure: 0.6113 - val_loss: 0.7442 - val_acc: 0.0000e+00 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - val_fbeta_score: 0.0000e+00 - val_fmeasure: 0.0000e+00
Epoch 2/3
1300/1337 [============================>.] - ETA: 0s - loss: 0.6634 - acc: 0.7269 - precision: 0.5819 - recall: 0.7292 - fbeta_score: 0.6462 - fmeasure: 0.6462
Epoch 00001: val_fmeasure di
1337/1337 [==============================] - 9s - loss: 0.6634 - acc: 0.7263 - precision: 0.5830 - recall: 0.7300 - fbeta_score: 0.6472 - fmeasure: 0.6472 - val_loss: 0.7616 - val_acc: 0.0000e+00 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - val_fbeta_score: 0.0000e+00 - val_fmeasure: 0.0000e+00
Epoch 3/3
1300/1337 [============================>.] - ETA: 0s - loss: 0.6542 - acc: 0.7354 - precision: 0.5879 - recall: 0.7308 - fbeta_score: 0.6508 - fmeasure: 0.6508
Epoch 00002: val_fmeasure di
1337/1337 [==============================] - 8s - loss: 0.6545 - acc: 0.7337 - precision: 0.5866 - recall: 0.7307 - fbeta_score: 0.6500 - fmeasure: 0.6500 - val_loss: 0.7801 - val_acc: 0.0000e+00 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - val_fbeta_score: 0.0000e+00 - val_fmeasure: 0.0000e+00
500/500 [==============================] - 0s
500/500 [==============================] - 1s
0.5
/usr/local/lib/python3.4/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
0.333333333333
/usr/local/lib/python3.4/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
             precision    recall  f1-score   support

          0       0.00      0.00      0.00       250
          1       0.50      1.00      0.67       250

avg / total       0.25      0.50      0.33       500
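
As a sanity check on the figures in the report, the sketch below recomputes the two printed averages in plain Python for the situation described: a balanced test set of 250 samples per class where every sample is predicted as class 1. The label lists are synthetic stand-ins built from the support counts, not the real data, and `per_class_f1` is a hypothetical helper written for this illustration, not part of scikit-learn. It treats an undefined precision or F1 as 0.0, mirroring what the warnings say scikit-learn does.

```python
# Synthetic stand-in for the test set: 250 samples per class,
# with every sample predicted as class 1 (assumption based on the report).
y_true = [0] * 250 + [1] * 250
y_pred = [1] * 500


def per_class_f1(y_true, y_pred, label):
    """Return (precision, recall, f1) for one label; undefined values become 0.0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


scores = {label: per_class_f1(y_true, y_pred, label) for label in (0, 1)}

# Macro average: unweighted mean of the per-class F1 values.
macro_f1 = sum(f1 for _, _, f1 in scores.values()) / len(scores)

# Micro average: for single-label data this equals plain accuracy.
micro_f1 = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

print(scores)
print(micro_f1, macro_f1)
```

Under these assumptions class 0 scores zero on all three measures and class 1 gets precision 0.5 and recall 1.0, which is consistent with the 0.5 micro and 0.33 macro averages printed in the output.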