I have started learning TensorFlow and, as an exercise, I am using it to find the zeros of a function. My approach is as follows:
import tensorflow as tf  # TensorFlow 1.x API

x = tf.Variable(0.0, trainable=True)  # Independent variable
y = 2 * tf.pow(x, 2) - 6 * x + 4  # Function whose zeros we want to find
loss = tf.pow(y, 2)  # Function with minima at the zeros of y(x)
opt = tf.train.GradientDescentOptimizer(0.1).minimize(loss)  # Optimizer

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):  # Minimizing loop
        print(sess.run([x, y, loss]))
        sess.run(opt)
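I expected that finding the minimum of the loss (i.e. y^2) would give me a zero of y. However, when I ran the code above, I got the following output: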
[0.0, 4.0, 16.0]
[4.8, 21.280005, 452.83856]
[-51.37921, 5591.9224, 31269596.0]
[236505.78, 111868550000.0, 1.2514572e+22]
[-2.1165915e+16, 8.959919e+32, inf]
[inf, nan, nan]
[nan, nan, nan]
What am I doing wrong? I expected to "discover" that y has a zero at x = 1.
Answer 0: (score: 0)
The learning rate is too large. A smaller learning rate (for example 0.01) gives the following output:
[0.0, 4.0, 16.0]
[0.48, 1.5808, 2.4989288]
[0.6089933, 1.087786, 1.1832783]
[0.68653125, 0.82346296, 0.6780912]
[0.7401202, 0.65483475, 0.42880854]
[0.77992785, 0.5370076, 0.28837714]
[0.8108626, 0.44982052, 0.2023385]
[0.83566165, 0.3826909, 0.14645232]
[0.85600054, 0.3294704, 0.10855074]
[0.8729749, 0.28632116, 0.08197981]
[0.8873373, 0.25071096, 0.06285599]
[0.8996254, 0.2208991, 0.048796415]
[0.9102352, 0.19564486, 0.03827691]
[0.91946596, 0.17403984, 0.030289866]
[0.9275488, 0.15540075, 0.024149394]
[0.93466556, 0.13920641, 0.019378424]
[0.9409614, 0.12504816, 0.015637042]
[0.94655395, 0.112605095, 0.0126799075]
[0.95153964, 0.101617336, 0.010326083]
[0.9559983, 0.09187603, 0.008441205]
This converges to x = 1.
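To see the effect of the learning rate without TensorFlow, the same update can be written out by hand. This is only a minimal sketch in plain Python: the helper names `y`, `grad_loss` and `descend` are made up for illustration, and the gradient is derived manually as d/dx y(x)^2 = 2 * y(x) * (4x - 6). It runs in float64, so later digits can differ slightly from the float32 output above.

```python
def y(x):
    # Same function as in the question: y(x) = 2x^2 - 6x + 4
    return 2 * x * x - 6 * x + 4


def grad_loss(x):
    # Gradient of loss = y(x)^2, using the chain rule with y'(x) = 4x - 6
    return 2 * y(x) * (4 * x - 6)


def descend(lr, steps, x0=0.0):
    # Plain gradient descent; returns every visited x, starting with x0
    xs = [x0]
    x = x0
    for _ in range(steps):
        x = x - lr * grad_loss(x)
        xs.append(x)
    return xs


if __name__ == "__main__":
    # Learning rate 0.1: the first step already overshoots to about 4.8
    print(descend(0.1, 3))
    # Learning rate 0.01: the iterates approach 1 from below
    print(descend(0.01, 1000)[-1])
```

At the starting point x = 0 the gradient is 2 * 4 * (-6) = -48, so a learning rate of 0.1 moves x by 4.8 in one step, while 0.01 moves it by only 0.48.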
Update: an explanation of why the original code diverges.
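As you can see in the original output, the x coordinate oscillates around the optimum x = 1 with growing amplitude. The farther x is from the optimum, the larger the resulting gradient, so each step overshoots more than the previous one. Eventually the loss exceeds the largest value that tf.float32 can represent, which results in inf.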
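The overflow can be checked with a few lines of plain Python. This is a rough sketch, assuming tf.float32 follows IEEE 754 single precision, whose largest finite value is (2 - 2^-23) * 2^127, about 3.4028235e38. The values of `x` and `y` are copied from the last finite row of the diverging output; the variable names are illustrative only.

```python
import math

# Largest finite IEEE 754 single-precision value (assumed to match tf.float32)
FLOAT32_MAX = (2 - 2**-23) * 2.0**127

# Last finite row of the diverging run
x = -2.1165915e16
y = 8.959919e32

# Computed here in float64, where both values are still finite
loss = y * y                # about 8.0e65
grad = 2 * y * (4 * x - 6)  # about -1.5e50

loss_overflows = abs(loss) > FLOAT32_MAX
grad_overflows = abs(grad) > FLOAT32_MAX

# Once x itself is infinite, y(x) evaluates inf - inf, which is nan
inf = float("inf")
y_next = 2 * inf * inf - 6 * inf + 4

print(loss_overflows, grad_overflows, math.isnan(y_next))
```

Both the loss and its gradient are far beyond the float32 range at that point, which is consistent with the next row showing x = inf and the rows after it showing only nan.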