Question

I am trying to implement the following loss function

To me, the most straightforward forward implementation would use torch.max

losses = torch.max(ap_distances - an_distances + margin, torch.Tensor([0]))

However, I see that other implementations on GitHub use F.relu

losses = F.relu(ap_distances - an_distances + margin)

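They give essentially the same output, but I would like to know whether there is any fundamental difference between the two approaches.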
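For reference, here is a minimal sketch of how the two variants can be compared on the forward pass. The helper names `hinge_with_max` and `hinge_with_relu`, the random inputs, and the margin value are made up for illustration; `torch.zeros_like` is used in place of `torch.Tensor([0])` only so that the zero tensor matches the shape, dtype, and device of the input.

```python
import torch
import torch.nn.functional as F


def hinge_with_max(ap, an, margin):
    # Elementwise maximum against a zero tensor of matching shape/dtype/device.
    diff = ap - an + margin
    return torch.max(diff, torch.zeros_like(diff))


def hinge_with_relu(ap, an, margin):
    # ReLU clamps negative entries to zero.
    return F.relu(ap - an + margin)


# Hypothetical inputs: 8 random anchor-positive / anchor-negative distances.
torch.manual_seed(0)
ap_distances = torch.rand(8)
an_distances = torch.rand(8)
margin = 0.5

loss_max = hinge_with_max(ap_distances, an_distances, margin)
loss_relu = hinge_with_relu(ap_distances, an_distances, margin)

# True when both variants produce identical per-sample losses.
same_forward = torch.equal(loss_max, loss_relu)
print(same_forward)
```

This only checks the forward values, so it does not by itself say anything about how the two behave under backpropagation.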

Answer 1

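According to this discussion, torch.max is not differentiable. A loss function needs to be continuous and differentiable for backpropagation to work. relu is differentiable because it can be approximated, which is why it can be used in a loss function.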
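The differentiability question can also be checked empirically on whatever PyTorch version is installed. The following is a minimal sketch, assuming a reasonably recent PyTorch in which the two-tensor form of `torch.max` takes part in autograd. The input values are made up and chosen so that no element falls exactly on the kink at 0.

```python
import torch
import torch.nn.functional as F

# Hypothetical distances; ap - an + margin is roughly [0.8, -0.5, 0.4].
ap = torch.tensor([1.0, 0.2, 0.9], requires_grad=True)
an = torch.tensor([0.5, 1.0, 0.8], requires_grad=True)
margin = 0.3

# Variant 1: elementwise max against zeros.
diff = ap - an + margin
loss_max = torch.max(diff, torch.zeros_like(diff)).sum()
(grad_max,) = torch.autograd.grad(loss_max, ap)

# Variant 2: ReLU on a freshly built graph.
loss_relu = F.relu(ap - an + margin).sum()
(grad_relu,) = torch.autograd.grad(loss_relu, ap)

# Gradient w.r.t. ap is 1 where the hinge is active and 0 elsewhere.
print(grad_max)
print(grad_relu)
```

If both calls return a gradient, then both variants can be backpropagated through on that version. The behaviour exactly at 0 may differ between the two and between versions, so that case is deliberately left out here.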