Loss increases and decreases randomly. Where should I put the loss?

Time: 2019-12-21 17:21:26

Tags: deep-learning pytorch loss transformer

I am implementing a dependency parsing model with PyTorch and am confused by the behavior I describe below. While computing the loss and backpropagating through the model, I tried several different things:

  • When I use the code below exactly as written, with a batch size of 1 (1 batch per iteration):
    • The loss seems to decrease, but after 20 epochs the predictions are still not good.
  • When I use the code below exactly as written, with a batch size of 100(0):
    • I get an error: RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
    • When I use inside_loss.backward(retain_graph=True) instead of inside_loss.backward(), as the error message suggests, execution takes far too long, and the loss increases and decreases randomly.
  • When I comment out inside_loss and uncomment the lines after the for loop:
    • The loss does not change at all.
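The RuntimeError in the second bullet can be reproduced in isolation (a minimal sketch, not the author's model): calling backward twice through any shared part of one computation graph fails, because PyTorch frees the graph's intermediate buffers after the first pass, and retain_graph=True avoids the error at the cost of keeping all those buffers in memory.

```python
import torch

# Two losses that share the intermediate node `shared`.
x = torch.ones(3, requires_grad=True)
shared = (x * 2).sum()

(shared * 3).backward()            # first pass frees the graph's buffers

second_backward_failed = False
try:
    (shared * 3).backward()        # reuses `shared` -> backward a second time
except RuntimeError:
    second_backward_failed = True  # "Trying to backward through the graph a second time"

# With retain_graph=True the buffers are kept alive, so a second pass works,
# but every intermediate tensor stays in memory (hence slow, heavy execution).
y = torch.ones(3, requires_grad=True)
shared_y = (y * 2).sum()
(shared_y * 3).backward(retain_graph=True)
(shared_y * 3).backward()          # allowed now; gradients accumulate in y.grad
```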

Here is the code:

def forward(self, x):
    # x holds the vocabulary scores. It requires grad, so its values
    # cannot be changed in place; work on a clone instead.
    x_prime = x.clone()

    # loss = Variable(torch.zeros(1), requires_grad=True)
    loss_v = torch.zeros(1)

    for i in range(x_prime.size(0)):

        # Some operations that change x_prime's values
        # Calculate sentence_probs and sentence_scores from x_prime's values

        eisner_values = eisner_torch(sentence_probs)

        # x_prime was changed above, so update x
        x = x_prime

        # Get the gold dependencies
        gold_deps = return_gold_deps(i, sentence_to_dependencies)

        if gold_deps is None:
            continue

        mask = np.greater(np.asarray(eisner_values), -1)

        # Calculate hinge loss
        inside_loss = hinge(sentence_scores, eisner_values, gold_deps, mask, 1)

        # Calculate total loss in the batch
        loss_v += inside_loss.data

        inside_loss.backward()

        # Optimizer step
        if self.opt is not None:
            self.opt.step()
            self.opt.optimizer.zero_grad()

    # loss.data = loss_v.data
    # loss.backward()
    # Optimizer Step
    return loss_v
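One thing worth noting about the loop above: `loss_v += inside_loss.data` copies a detached value, so the accumulated `loss_v` is not connected to the graph, which would explain why the commented-out `loss.backward()` variant leaves the loss unchanged. A common way to restructure such a loop (a sketch with a stand-in linear model and MSE loss, not the actual parser or hinge function) is to keep the per-sentence losses attached to the graph and call backward once per batch:

```python
import torch

model = torch.nn.Linear(4, 1)                  # stand-in for the parsing model
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

batch = torch.randn(8, 4)                      # stand-in sentence features
targets = torch.randn(8, 1)

opt.zero_grad()
total_loss = torch.zeros(())                   # a tensor, so autograd tracks the sum
for i in range(batch.size(0)):
    pred = model(batch[i:i + 1])
    inside_loss = torch.nn.functional.mse_loss(pred, targets[i:i + 1])
    total_loss = total_loss + inside_loss      # NOT `+= inside_loss.data`

total_loss.backward()                          # one backward pass over the batch
opt.step()
```

Because each iteration contributes to a single graph that is traversed exactly once, neither retain_graph=True nor a per-sentence optimizer step is needed.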

I use the Adam optimizer for this task:

model_opt = NoamOpt(model_size=d_model, factor=1, warmup=200,
                    optimizer=torch.optim.Adam(model.parameters(), lr=0,
                                               betas=(0.9, 0.98), eps=1e-9))

What is wrong here? How can I fix this problem?

Thanks.

0 Answers
