Question

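I am training a BERT model on a reviews dataset and fine-tuning it with 2 dense layers on top. However, the training loss does not decrease as the number of epochs increases. Below are the model architecture and training code: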

  import torch.nn as nn
  from transformers import BertModel

  # label_dict is defined earlier in the notebook (label name -> class index)
  bert = BertModel.from_pretrained("bert-base-uncased", num_labels=len(label_dict),
                                   output_attentions=False, output_hidden_states=False)

  # freeze all the BERT parameters so only the dense layers are trained
  for param in bert.parameters():
      param.requires_grad = False

  class bertModel(nn.Module):
      def __init__(self, bert):
          super(bertModel, self).__init__()
          self.bert = bert
          self.dropout1 = nn.Dropout(0.1)
          self.relu = nn.ReLU()
          self.fc1 = nn.Linear(self.bert.config.hidden_size, 512)
          self.fc2 = nn.Linear(512, 2)
          # log-probabilities over the 2 output classes
          self.softmax = nn.LogSoftmax(dim=1)

      def forward(self, **inputs):
          # index 1 is the pooled output; indexing works whether the model
          # returns a plain tuple or a ModelOutput object
          x = self.bert(**inputs)[1]
          x = self.fc1(x)
          x = self.relu(x)
          x = self.dropout1(x)
          x = self.fc2(x)
          x = self.softmax(x)
          return x

I am using the Adam optimizer with a batch size of 256 and lr = 0.00001. The full code is available at this link: https://github.com/gprashmi/Sentiment_Analysis/blob/main/sentiment_analysis_modelling1.ipynb
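
For reference, the setup described here (frozen encoder, two dense layers, Adam with lr = 0.00001, batch size 256) can be sketched as a single training step. This is a minimal sketch, not the notebook's code: the `encoder` below is a hypothetical small linear layer standing in for the frozen BERT model so the example runs without downloading weights, the input features are random, and `nn.NLLLoss` is an assumption (it is the usual partner for `LogSoftmax`, but the post does not show which loss is used).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the frozen BERT encoder (assumption, not real BERT).
hidden_size = 768  # hidden size of bert-base-uncased
encoder = nn.Linear(32, hidden_size)
for param in encoder.parameters():
    param.requires_grad = False

# Same head as in the post: 768 -> 512 -> 2 with ReLU, dropout, LogSoftmax.
head = nn.Sequential(
    nn.Linear(hidden_size, 512),
    nn.ReLU(),
    nn.Dropout(0.1),
    nn.Linear(512, 2),
    nn.LogSoftmax(dim=1),
)
model = nn.Sequential(encoder, head)

# Hand only the trainable (unfrozen) parameters to Adam.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-5)
criterion = nn.NLLLoss()  # assumed loss; expects log-probabilities

# One batch of 256 random examples with labels in {0, 1}.
features = torch.randn(256, 32)
labels = torch.randint(0, 2, (256,))

model.train()
optimizer.zero_grad()
loss = criterion(model(features), labels)
loss.backward()
optimizer.step()

n_trainable = sum(p.numel() for p in trainable)
n_frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
print(f"trainable: {n_trainable}, frozen: {n_frozen}, loss: {loss.item():.4f}")
```

Counting trainable versus frozen parameters this way is a quick check that the optimizer is actually receiving the dense-layer weights. Note that with the encoder frozen, a learning rate of 0.00001 applies only to the two new layers.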

Can anyone help me figure out how to reduce the loss?

Update: sample training loss for epoch 1:

 Epoch: 1 

 Batch: 0, Training Loss: 0.046884429454803464
 Batch: 25, Training Loss: 1.6009380578994752
 Batch: 50, Training Loss: 1.6312992066144942
 Batch: 75, Training Loss: 1.6702612936496735
 Batch: 100, Training Loss: 1.614540034532547
 Batch: 125, Training Loss: 1.6372735381126404
 End of Epoch 1, Avg. Training Loss: 7.984550286616598, Avg. validation Loss: 15.047848328666866
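
Two reference numbers can help when reading a log like this. A model that predicts a uniform distribution over K classes scores a cross-entropy of ln K, and a sample-weighted epoch average of per-batch mean losses must lie between the smallest and largest batch loss. The sketch below computes both. The helper names `chance_level_loss` and `epoch_average` are hypothetical, the loss values are the five logged for batches 25 to 125, and a batch size of 256 for each is an assumption taken from the post. How the notebook itself accumulates its averages is not shown in the post.

```python
import math

def chance_level_loss(num_classes):
    """Cross-entropy of a uniform prediction over num_classes classes."""
    return math.log(num_classes)

def epoch_average(batch_losses, batch_sizes):
    """Sample-weighted mean of per-batch mean losses."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)

# Values logged for batches 25, 50, 75, 100 and 125 in the post.
logged = [
    1.6009380578994752,
    1.6312992066144942,
    1.6702612936496735,
    1.614540034532547,
    1.6372735381126404,
]
avg = epoch_average(logged, [256] * len(logged))

print(f"average of logged batches: {avg:.4f}")
print(f"chance level for 2 classes: {chance_level_loss(2):.4f}")
print(f"chance level for 5 classes: {chance_level_loss(5):.4f}")
```

Only every 25th batch is logged, so these five values do not determine the true epoch average, but the comparison gives a rough scale for what the reported averages should look like.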

Training loss of HuggingFace BERT model does not decrease during training

0 answers: