The following simplified code outputs nan for the derivatives at x = 0. I am running TensorFlow 2.0.0.
import tensorflow as tf
x = tf.Variable([[-1.0], [0.0], [1.0]])
with tf.GradientTape(persistent=True) as t:
    t.watch(x)
    # case 1: y = x^4
    # y = tf.reduce_sum(tf.pow(x, 4), axis=1) # gives nan for 5th derivative at x=0
    # case 2: y = x + x^2 + x^3 + x^4
    y = tf.reduce_sum(tf.pow(x, [[1, 2, 3, 4]]), axis=1) # gives nan for 2nd to 5th derivative at x=0
    # the gradient calls stay inside the tape context so that higher-order derivatives are recorded
    dy_dx = t.gradient(y, x)
    d2y_dx2 = t.gradient(dy_dx, x)
    d3y_dx3 = t.gradient(d2y_dx2, x)
    d4y_dx4 = t.gradient(d3y_dx3, x)
    d5y_dx5 = t.gradient(d4y_dx4, x)
del t
tf.print(y)
tf.print(tf.transpose(dy_dx)) # transpose only to fit on one line when printed
tf.print(tf.transpose(d2y_dx2))
tf.print(tf.transpose(d3y_dx3))
tf.print(tf.transpose(d4y_dx4))
tf.print(tf.transpose(d5y_dx5))
This outputs the correct values everywhere except at x = 0:
[0 0 4]
[[-2 1 10]]
[[8 -nan(ind) 20]]
[[-18 -nan(ind) 30]]
[[24 -nan(ind) 24]]
[[0 -nan(ind) 0]]
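For comparison, the values I would expect for case 2 can be worked out directly from the polynomial's coefficients. The following is only a plain-Python sketch with no TensorFlow involved; `differentiate`, `evaluate` and `derivatives_at` are helper names made up for this illustration.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where c_k multiplies x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def derivatives_at(coeffs, x, orders):
    """Return the 1st to `orders`-th derivative of the polynomial, evaluated at x."""
    values = []
    for _ in range(orders):
        coeffs = differentiate(coeffs)
        values.append(evaluate(coeffs, x))
    return values


# case 2: y = x + x^2 + x^3 + x^4
case2 = [0.0, 1.0, 1.0, 1.0, 1.0]
for x in (-1.0, 0.0, 1.0):
    print(x, derivatives_at(case2, x, 5))
```

At x = -1 and x = 1 this reproduces the TensorFlow output above. At x = 0 it gives 1, 2, 6, 24, 0, which is what I would expect in the middle column instead of nan.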
If you run the tf.pow(x, 4) case instead, nan appears only for the 5th derivative:
[1 0 1]
[[-4 0 4]]
[[12 0 12]]
[[-24 0 24]]
[[24 24 24]]
[[-0 -nan(ind) 0]]
So my questions are:
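The TensorFlow documentation does not explicitly state that the pow function supports two arguments of different sizes, yet the first output, y, is correct. Does anyone have experience with this? I am expecting a matrix in which each of the 3 input x values is raised to each of the 4 powers.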
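To make concrete what I expect from tf.pow(x, [[1, 2, 3, 4]]), here is the same computation written with explicit loops in plain Python. It is only a sketch and assumes the usual broadcasting semantics, where a 3 x 1 input against a 1 x 4 row of exponents gives a 3 x 4 result; the names `xs`, `exponents`, `powers` and `row_sums` are only for illustration.

```python
xs = [-1.0, 0.0, 1.0]      # the three inputs, one per row
exponents = [1, 2, 3, 4]   # the four powers, one per column

# Row i, column j holds xs[i] ** exponents[j], so the result is a 3 x 4 matrix.
powers = [[x ** p for p in exponents] for x in xs]

# Summing each row corresponds to tf.reduce_sum(..., axis=1).
row_sums = [sum(row) for row in powers]

print(powers)
print(row_sums)
```

The row sums come out as 0, 0, 4, which matches the y printed by the TensorFlow code for case 2.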
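Are the nan values returned from the gradient a bug that I should report? I did find an earlier issue that may be related, but it has been resolved: https://github.com/tensorflow/tfjs/issues/346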