Validation loss keeps increasing at a very high rate, and validation accuracy stays constant

Time: 2021-01-16 15:37:06

Tags: machine-learning optimization nlp pytorch lstm

I am trying to build a model that predicts whether a tweet is about a real disaster or not. Link for the question's description.

I have tried many things, but the validation accuracy stays constant. It seems my model starts overfitting after just 1 epoch, and the validation loss increases at a very high rate. I am stuck at this point. It would be great if you could look at the provided notebook and point out what I should focus on. Thanks.

My model

  def forward(self, x, hidden):
    batch=len(x)                                  # batch size of the current batch
    embedd=self.embedding(x)                      # (batch, seq_len, embedding_dim)
    lstm_out1, hidden=self.lstm1(embedd, hidden)  # (batch, seq_len, hidden_dim)
    # flatten all time steps so the linear layers see (batch*seq_len, hidden_dim)
    lstm_out=lstm_out1.contiguous().view(-1, self.hidden_dim)
    out=self.dropout(F.relu(lstm_out))
    out=self.fc1(out)
    out=self.dropout(F.relu(out))
    out=self.fc2(out)
    out=self.dropout(F.relu(out))
    out=self.fc3(out)
    sig_out=self.sigmoid(out)                     # per-time-step probabilities
    sig_out=sig_out.view(batch, -1)               # back to (batch, seq_len)
    sig_out=sig_out[:, -1]                        # keep only the last time step
    return sig_out, hidden
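
My exact class definition is in the notebook; for context, a constructor consistent with the forward pass above and with the call `RealOrFakeLSTM(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, False, 0.3)` below would look roughly like this. This is a reconstruction, so the 1-unit `fc3` (one probability per tweet for `BCELoss`) and `batch_first=True` are assumptions, not my verbatim code:

  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class RealOrFakeLSTM(nn.Module):
    # Sketch: argument order matches the call site
    # RealOrFakeLSTM(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, False, 0.3)
    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim,
                 n_layers, bidirectional, drop_prob):
      super().__init__()
      self.hidden_dim=hidden_dim
      self.n_layers=n_layers
      self.embedding=nn.Embedding(vocab_size, embedding_dim)
      self.lstm1=nn.LSTM(embedding_dim, hidden_dim, n_layers,
                         batch_first=True, bidirectional=bidirectional)
      self.dropout=nn.Dropout(drop_prob)
      self.fc1=nn.Linear(hidden_dim, hidden_dim)
      self.fc2=nn.Linear(hidden_dim, hidden_dim)
      self.fc3=nn.Linear(hidden_dim, 1)  # assumption: 1 unit, since BCELoss takes a single probability
      self.sigmoid=nn.Sigmoid()

    def init_hidden(self, batch_size):
      # zero-initialized (h_0, c_0), created on the same device as the parameters
      weight=next(self.parameters())
      return (weight.new_zeros(self.n_layers, batch_size, self.hidden_dim),
              weight.new_zeros(self.n_layers, batch_size, self.hidden_dim))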

My parameters

learning_rate=0.001
epochs=15
vocab_size = len(vocab_to_int)+1 # +1 for the 0 padding = 29364
output_size = 2 # the output size we want
embedding_dim = 50 # size of the embedding vectors
hidden_dim = 32 # hidden state size of the LSTM
n_layers = 2
batch_size = 23
net=RealOrFakeLSTM(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, False, 0.3)
net.to(device)
criterion=nn.BCELoss()
optimizer=torch.optim.Adam(net.parameters(), lr=learning_rate)
net.train()
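
As a quick sanity check on the shapes after constructing the net, something like this works (the sequence length of 40 is arbitrary, just for illustration):

# Hypothetical smoke test: push one random batch through the network.
x=torch.randint(0, vocab_size, (batch_size, 40)).long().to(device)  # (23, 40) token ids
h=net.init_hidden(batch_size)
out, h=net(x, h)
print(out.shape)  # torch.Size([23]) -- one probability per tweet, in (0, 1) for BCELoss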

And finally, my training code:

loss_train_arr=np.array([])      # per-batch training losses
lossPerEpoch_train=np.array([])  # mean training loss per epoch

for i in range(epochs):
  hidden=net.init_hidden(batch_size)
  total_train_loss=0
  for input,label in train_loader:
    input=input.to(device)
    label=label.to(device)
    optimizer.zero_grad()
    input=input.clone().detach().long()
    # detach the hidden state so backprop does not reach into previous batches
    hidden=tuple(h.detach() for h in hidden)
    out, hidden=net(input, hidden)
    loss_train=criterion(out.squeeze(),label.float())
    loss_train_arr=np.append(loss_train_arr,loss_train.cpu().detach().numpy())
    loss_train.backward()
    optimizer.step()
    total_train_loss+=loss_train.item()  # accumulate a float, not a graph-holding tensor
  total_train_loss=total_train_loss/len(train_loader)
  lossPerEpoch_train=np.append(lossPerEpoch_train,total_train_loss)
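
The validation pass is in the notebook; it follows the usual eval pattern, roughly like this sketch (assuming a `val_loader` built like `train_loader` and a 0.5 decision threshold):

net.eval()
total_val_loss=0
correct=0
total=0
with torch.no_grad():  # no gradients needed for evaluation
  hidden=net.init_hidden(batch_size)
  for input,label in val_loader:
    input=input.clone().detach().long().to(device)
    label=label.to(device)
    out, hidden=net(input, hidden)
    total_val_loss+=criterion(out.squeeze(),label.float()).item()
    pred=(out.squeeze()>0.5).float()           # threshold the sigmoid output
    correct+=(pred==label.float()).sum().item()
    total+=label.size(0)
print(total_val_loss/len(val_loader), correct/total)
net.train()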

[Plot: Training vs validation accuracy per epoch]

[Plot: Training vs validation loss per epoch]

0 answers:

No answers yet.