Question

I am trying to implement the marginal loss introduced in paper [1]. This is what I have so far.

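import tensorflow as tf

def marginal_loss(model1, model2, y, margin, threshold):
    # Normalization factor 1 / (margin^2 - margin)
    margin_ = 1/(tf.pow(margin,2)-margin)
    # (1 - y) weights each pair by its label
    tmp = (1. - y)
    # Euclidean distance between the two embeddings, one value per row
    euc_dist = tf.sqrt(tf.reduce_sum(tf.pow(model1-model2, 2), 1, keep_dims=True))
    # Signed gap between the threshold and the distance
    thres_dist = threshold - euc_dist
    mul_val = tf.multiply(tmp, thres_dist)
    # Sum over all pairs, then scale by the normalization factor
    sum_ = tf.reduce_sum(mul_val)
    return tf.multiply(margin_, sum_)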

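However, after some epochs the loss value becomes nan, and I am not sure what I did wrong. Also, I replaced epsilon (as described in the paper) with 1, because its value is unclear. Likewise, the exact threshold value is not known.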

Thanks for your help.

[1] https://ibug.doc.ic.ac.uk/media/uploads/documents/deng_marginal_loss_for_cvpr_2017_paper.pdf

Answer 1

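This looks very similar to the problem raised in this other question. The problem likely comes from using tf.sqrt, which has the bad property that its gradient goes to infinity as its input approaches zero, which causes instability as the model converges.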
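
To see why the square root is a plausible culprit, here is a small plain-Python sketch (standard library only, no TensorFlow) that evaluates the analytic derivative of sqrt(x), which is 1 / (2 * sqrt(x)), for shrinking values of x. The helper name sqrt_grad is made up for this illustration.

```python
import math

def sqrt_grad(x):
    """Analytic derivative of sqrt(x): 1 / (2 * sqrt(x)); infinite at x == 0."""
    if x == 0.0:
        return math.inf
    return 1.0 / (2.0 * math.sqrt(x))

# The derivative grows without bound as x approaches zero.
for x in (1.0, 1e-4, 1e-8, 1e-12, 0.0):
    print(f"x = {x:g}  d/dx sqrt(x) = {sqrt_grad(x):g}")
```

When two embeddings become (nearly) identical, the summed squared difference is (nearly) zero, so the backward pass through the square root multiplies by a huge or infinite factor. Combined with a zero gradient elsewhere in the chain, that is one common way a nan can appear.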

Try to get rid of the tf.sqrt in the loss, for example by minimizing the square of the current loss.
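
One way to read this suggestion, sketched in plain Python so it runs without TensorFlow: work with the squared distance and compare it against a squared threshold. This is an assumption about how to apply the advice, not the formula from the paper, and it changes the numerical value of the loss. The function names squared_distance, squared_distance_grad and pair_term are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def squared_distance_grad(a, b):
    """Gradient of squared_distance with respect to a: 2 * (a - b)."""
    return [2.0 * (x - y) for x, y in zip(a, b)]

def pair_term(a, b, y, threshold):
    """Per-pair term (1 - y) * (threshold**2 - ||a - b||**2), with no sqrt."""
    return (1.0 - y) * (threshold ** 2 - squared_distance(a, b))

# Identical embeddings: the gradient is exactly zero and stays finite.
print(squared_distance_grad([1.0, 2.0], [1.0, 2.0]))
```

In the TensorFlow code from the question, the analogous change would be to drop the tf.sqrt call around tf.reduce_sum(...) and adjust the threshold accordingly.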

Alternatively, you can rely on existing built-in functions such as tf.losses.hinge_loss (though it is not suited to multi-dimensional outputs).
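
For reference, a hinge loss computes max(0, 1 - label * score) per element. The plain-Python sketch below shows that computation. It is not the TensorFlow implementation: the mapping of 0/1 labels to -1/+1 follows my understanding of how tf.losses.hinge_loss treats its labels, and averaging over elements is an assumption of this sketch.

```python
def hinge_loss(labels, logits):
    """Mean hinge loss; labels are 0/1 and are mapped to -1/+1 first."""
    signed = [2.0 * y - 1.0 for y in labels]  # 0 -> -1, 1 -> +1
    losses = [max(0.0, 1.0 - s * z) for s, z in zip(signed, logits)]
    return sum(losses) / len(losses)  # averaging is an assumption here

# First example is on the correct side with enough margin (loss 0),
# second is on the wrong side (loss 1.5), so the mean is 0.75.
print(hinge_loss([1.0, 0.0], [2.0, 0.5]))
```

Note that this takes one score per example, which is why it does not map directly onto a distance between multi-dimensional embeddings.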
