Correctly using a BERT transformer model for multi-class sentiment analysis

Date: 2021-02-19 06:10:42

Tags: python one-hot-encoding bert-language-model transfer-learning transformer

My output labels are one-hot encoded over four classes: positive, negative, mixed, and neutral, with 1s and 0s, e.g. [1 0 0 0] represents a positive text.
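
For illustration, a minimal sketch of how such one-hot labels can be built; the class order, variable names, and example labels here are assumptions, not taken from the question:

    import numpy as np

    # Assumed class order: positive, negative, mixed, neutral
    classes = ["positive", "negative", "mixed", "neutral"]
    class_to_idx = {c: i for i, c in enumerate(classes)}

    raw_labels = ["positive", "neutral", "mixed"]  # hypothetical raw labels

    # Each row of the identity matrix is the one-hot vector for one class,
    # so indexing into it one-hot encodes the label list in one step.
    y = np.eye(len(classes), dtype=int)[[class_to_idx[l] for l in raw_labels]]
    print(y)
    # [[1 0 0 0]
    #  [0 0 0 1]
    #  [0 0 1 0]]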

I am trying to train a BERT transformer model, set up as follows:

    from transformers import TFBertForSequenceClassification, BertTokenizer

    transformer_name = "bert-base-uncased"
    pre_trained_model = TFBertForSequenceClassification.from_pretrained(transformer_name)
    tokenizer = BertTokenizer.from_pretrained(transformer_name)

    pre_trained_model.compile(loss='SparseCategoricalCrossentropy', optimizer='adam', metrics=['acc'])
    history = pre_trained_model.fit(X_train, y_train, epochs=5,
                                    validation_data=(X_valid, y_valid), verbose=1)

Note that X_train and X_valid are texts that I tokenized using BertTokenizer.from_pretrained(transformer_name).
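
For reference, a minimal sketch of what that tokenization step might look like; the texts variable and the padding/truncation parameters are assumptions:

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    texts = ["the movie was great", "the movie was terrible"]  # hypothetical inputs

    # The tokenizer returns a dict of TF tensors: input_ids,
    # token_type_ids, and attention_mask.
    encodings = tokenizer(texts, padding=True, truncation=True,
                          max_length=128, return_tensors="tf")
    X_train = dict(encodings)  # TFBertForSequenceClassification accepts this dict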

With this setup, I receive the following error:

    logits and labels must have the same first dimension, got logits shape [32,2] and labels shape [128]
         [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-43-d1f017482d98>:7) ]] [Op:__inference_train_function_49712]

What is wrong with the shape of my labels, and am I using the wrong loss function here?
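
For context, the shapes in the error are consistent with two mismatches: from_pretrained defaults to 2 labels (hence logits of shape [32, 2]), and SparseCategoricalCrossentropy expects integer class indices, so the 32 one-hot rows of length 4 appear flattened into 128 values (32 × 4 = 128). A hedged sketch of the two adjustments this suggests, not a confirmed answer to the question:

    import tensorflow as tf
    from transformers import TFBertForSequenceClassification

    # Assumption: four sentiment classes, so num_labels is set explicitly
    # instead of relying on the 2-label default.
    model = TFBertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=4)

    # Option A: keep one-hot labels and use the non-sparse loss
    # (the model outputs raw logits, hence from_logits=True).
    model.compile(
        loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
        optimizer='adam', metrics=['acc'])

    # Option B: keep the sparse loss and convert one-hot rows
    # to integer class indices instead:
    # y_train_idx = tf.argmax(y_train, axis=1)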

0 Answers