Question

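So the output of my network is a list of probabilities, which I then round to either 0 or 1 using tf.round(); this is crucial for the project. I then found out that tf.round isn't differentiable, so I'm kind of lost... :/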

Answer 1

You can use the fact that tf.maximum() and tf.minimum() are differentiable, and that the inputs are probabilities between 0 and 1:

# round numbers less than 0.5 to zero;
# by making them negative and taking the maximum with 0
differentiable_round = tf.maximum(x-0.499,0)
# scale the remaining numbers (0 to 0.5) to greater than 1
# the other half (zeros) is not affected by multiplication
differentiable_round = differentiable_round * 10000
# take the minimum with 1
differentiable_round = tf.minimum(differentiable_round, 1)

Example:

[ 0.1,    0.5,    0.7  ]
[-0.399,  0.001,  0.201] # x - 0.499
[ 0,      0.001,  0.201] # max(x-0.499, 0)
[ 0,     10,   2010    ] # max(x-0.499, 0) * 10000
[ 0,      1.0,    1.0  ] # min(max(x-0.499, 0) * 10000, 1)
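One thing worth checking with this approach is where the gradient is actually non-zero. The sketch below is plain Python with finite differences, not TensorFlow's autodiff, and the names `clamp_round` and `numeric_slope` are made up for illustration. It suggests the slope is zero everywhere except on the narrow ramp between 0.499 and 0.4991, so most inputs would contribute no gradient.

```python
def clamp_round(x):
    """Plain-Python version of min(max(x - 0.499, 0) * 10000, 1)."""
    return min(max(x - 0.499, 0.0) * 10000.0, 1.0)


def numeric_slope(f, x, h=1e-6):
    """Central finite difference, a rough stand-in for the gradient."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# One point below the ramp, one on it, one above it.
for x in (0.1, 0.49905, 0.7):
    print(x, clamp_round(x), numeric_slope(clamp_round, x))
```

On the ramp the slope is about 10000 (the scale factor); on the flat parts it is 0.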

Answer 2

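Rounding is a fundamentally non-differentiable function, so you're out of luck there. The normal procedure in this situation is to find a way to work with the probabilities directly, for example by using them to compute an expected value, or by taking the output with the highest probability as the network's prediction. If you aren't using the rounded output to compute your loss function, you can simply apply rounding to the result, and it doesn't matter whether it is differentiable. And if you want an informative loss function for training the network, consider that keeping the output as probabilities may actually work in your favor (it will likely make training smoother); you can then convert the probabilities to actual estimates outside the network, after training.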
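A minimal plain-Python sketch of that split, assuming a binary cross-entropy loss (the answer does not name a specific loss; `binary_cross_entropy` and `to_hard_predictions` are hypothetical helper names): the loss sees the raw probabilities, and thresholding happens only afterwards, outside the part that needs gradients.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy computed on the un-rounded probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def to_hard_predictions(probs, threshold=0.5):
    """Applied after training, so it does not need to be differentiable."""
    return [1 if p >= threshold else 0 for p in probs]


probs = [0.1, 0.5, 0.7]
labels = [0, 1, 1]
print(binary_cross_entropy(probs, labels))
print(to_hard_predictions(probs))
```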

Answer 3

This works for me:

x_rounded_NOT_differentiable = tf.round(x)
x_rounded_differentiable = (x - (tf.stop_gradient(x) - x_rounded_NOT_differentiable))

Answer 4

Something along the lines of x - sin(2πx)/(2π)?

I'm sure there's a way to squish the slope to make it a bit steeper.
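A quick plain-Python sketch of this idea. The `iterations` parameter is my own assumption about one way to steepen the slope (applying the same map repeatedly), not something stated in the answer; `sine_round` is a made-up name.

```python
import math


def sine_round(x, iterations=1):
    """Apply x -> x - sin(2*pi*x) / (2*pi) one or more times.

    Integers are fixed points of the map, and each application pulls x
    closer to the nearest integer, so more iterations give a steeper step
    around the half-integers.
    """
    for _ in range(iterations):
        x = x - math.sin(2.0 * math.pi * x) / (2.0 * math.pi)
    return x


for x in (0.3, 0.7, 2.0):
    print(x, sine_round(x), sine_round(x, iterations=3))
```

The derivative of a single application is 1 - cos(2πx), which is zero at the integers, so the flatter the plateaus become, the weaker the gradient there.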

Answer 5

For inputs in the range 0 to 1, a shifted and scaled sigmoid could be a solution:

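  slope = 1000
  center = 0.5
  # tf.sigmoid(z) equals e/(e+1) with e = exp(z), but it does not
  # overflow to inf/NaN the way tf.exp does when z is large
  round_diff = tf.sigmoid(slope * (x - center))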

Answer 6

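Building on a previous answer, one way to get an arbitrarily good approximation is to approximate round() with a finite Fourier series, using as many terms as you need. Fundamentally, you can think of round(x) as x plus a reverse (i.e. descending) sawtooth wave. Using the Fourier expansion of the sawtooth wave, we get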

$\operatorname{round}(x) \approx x + \frac{1}{\pi} \sum_{n=1}^{N} \frac{(-1)^n \sin(2\pi n x)}{n}$

With N = 5, we already get a pretty good approximation.
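The truncated series can be evaluated directly to see how close it gets. This is a plain-Python sketch rather than TensorFlow code, and `fourier_round` and `n_terms` are illustrative names:

```python
import math


def fourier_round(x, n_terms=5):
    """x plus the first n_terms of the descending sawtooth's Fourier series."""
    total = 0.0
    for n in range(1, n_terms + 1):
        total += (-1) ** n * math.sin(2.0 * math.pi * n * x) / n
    return x + total / math.pi


for x in (0.25, 0.75, 3.0):
    print(x, fourier_round(x, n_terms=5), fourier_round(x, n_terms=50))
```

The approximation is weakest near the half-integers, where round() jumps and the truncated series can only ramp between the two values.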

Answer 7

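An old question, but I just solved this problem for TensorFlow 2.0. I'm using the following round function in my audio autoencoder project. I basically want to create a discrete representation of sound that is compressed in time, and I use the round function to clamp the encoder's output to integer values. It has been working well for me so far.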

@tf.custom_gradient
def round_with_gradients(x):
    def grad(dy):
        return dy
    return tf.round(x), grad
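To see what passing the gradient straight through buys you, here is a toy stand-in written in plain Python with a hand-written chain rule. It is not TensorFlow, and `ste_round` and `train` are hypothetical names. The forward pass rounds, while the backward pass pretends the derivative of round is 1, so a single weight can still be trained through the rounding step.

```python
def ste_round(x):
    """Forward: round(x). Backward: pass the upstream gradient unchanged."""
    value = float(round(x))

    def grad(dy):
        return dy  # treat d round(x)/dx as 1

    return value, grad


def train(w, x, target, lr=0.05, steps=200):
    """Gradient descent on loss = (round(w * x) - target) ** 2."""
    for _ in range(steps):
        y, grad_fn = ste_round(w * x)
        dloss_dy = 2.0 * (y - target)
        dloss_dw = grad_fn(dloss_dy) * x  # chain rule through the fake gradient
        w -= lr * dloss_dw
    return w


w = train(0.2, 1.0, 3.0)
print(w, round(w * 1.0))
```

With the true derivative of round (zero almost everywhere), `dloss_dw` would always be 0 and `w` would never move.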

Differentiable round function in Tensorflow?

7 answers: