Low accuracy when using an LSTM model for text classification

Asked: 2020-07-14 23:17:45

Tags: python keras neural-network lstm text-classification

I am trying to train an LSTM model for fake news detection using the title and text features of my dataset. Below is the code for my model:

    from tensorflow.keras import Sequential, layers

    vocab_size = len(tokenizer.word_index) + 1  # gives me a value of 12
    embedding_dim = 50
    maxlen = 50

    model = Sequential()
    model.add(layers.Embedding(vocab_size, embedding_dim, input_length=maxlen))
    model.add(layers.LSTM(128, activation="relu"))
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dropout(0.3))
    model.add(layers.Dense(1, activation="softmax"))

    model.compile(optimizer=opt,  # opt: optimizer instance defined elsewhere (not shown)
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.summary()

    model_train = model.fit(X_train_cnn, y_train2,
                            epochs=15,
                            verbose=True,
                            validation_data=(X_test_cnn, y_test2),
                            batch_size=32)
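
For context, X_train_cnn / X_test_cnn come from a Keras Tokenizer plus pad_sequences step roughly like the sketch below (not my exact code; train_texts / test_texts stand in for the combined title + text strings):

    # Rough sketch of the assumed preprocessing; train_texts / test_texts are
    # placeholder names for the combined title + text strings.
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(train_texts)            # fit the vocabulary on the training texts
    vocab_size = len(tokenizer.word_index) + 1     # this is where the value 12 comes from

    X_train_cnn = pad_sequences(tokenizer.texts_to_sequences(train_texts),
                                maxlen=maxlen, padding="post")
    X_test_cnn = pad_sequences(tokenizer.texts_to_sequences(test_texts),
                               maxlen=maxlen, padding="post")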

Both the training and validation accuracy stay at around 47%:

    Epoch 15/15
    30081/30081 [==============================] - 60s 2ms/step - loss: 7.9900 - accuracy: 0.4789 - val_loss: 8.0780 - val_accuracy: 0.4732

Confusion matrix:

    array([[   0, 7806],
           [   0, 7011]], dtype=int64)

Classification report:

                  precision    recall  f1-score   support

               0       0.00      0.00      0.00      7806
               1       0.47      1.00      0.64      7011

        accuracy                           0.47     14817

I have tried different numbers of epochs, different batch sizes, and stacking two LSTM layers with various unit counts (one such variant is sketched below), but with no luck. Please help me.
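
For example, one of the two-layer LSTM variants I tried looked roughly like this (a sketch only; the exact unit counts varied between runs):

    # Representative two-layer LSTM variant (unit counts are illustrative, not exact)
    model = Sequential()
    model.add(layers.Embedding(vocab_size, embedding_dim, input_length=maxlen))
    model.add(layers.LSTM(128, return_sequences=True))  # return the full sequence for the next LSTM
    model.add(layers.LSTM(64))
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dropout(0.3))
    model.add(layers.Dense(1, activation="softmax"))     # same output layer as before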

0 Answers