I am running the following code:
import tensorflow as tf
# data set
x_data = [10., 20., 30., 40.]
y_data = [20., 40., 60., 80.]
# try to find values for w and b that compute y_data = W * x_data + b
# range is -1000 ~ 1000
W = tf.Variable(tf.random_uniform([1], -1000., 1000.))
b = tf.Variable(tf.random_uniform([1], -1000., 1000.))
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
# my hypothesis
hypothesis = W * X + b
# Simplified cost function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# minimize
a = tf.Variable(0.1) # learning rate, alpha
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost) # goal is minimize cost
# before starting, initialize the variables
init = tf.initialize_all_variables()
# launch
sess = tf.Session()
sess.run(init)
# fit the line
for step in xrange(2001):
    sess.run(train, feed_dict={X: x_data, Y: y_data})
    if step % 100 == 0:
        print step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W), sess.run(b)

print sess.run(hypothesis, feed_dict={X: 5})
print sess.run(hypothesis, feed_dict={X: 2.5})
I don't understand why I get this result:
0 1.60368e+10 [ 4612.54003906] [ 406.81304932]
100 nan [ nan] [ nan]
200 nan [ nan] [ nan]
300 nan [ nan] [ nan]
400 nan [ nan] [ nan]
500 nan [ nan] [ nan]
600 nan [ nan] [ nan]
700 nan [ nan] [ nan]
800 nan [ nan] [ nan]
900 nan [ nan] [ nan]
1000 nan [ nan] [ nan]
1100 nan [ nan] [ nan]
1200 nan [ nan] [ nan]
1300 nan [ nan] [ nan]
1400 nan [ nan] [ nan]
1500 nan [ nan] [ nan]
1600 nan [ nan] [ nan]
1700 nan [ nan] [ nan]
1800 nan [ nan] [ nan]
1900 nan [ nan] [ nan]
2000 nan [ nan] [ nan]
[ nan]
[ nan]
If I change the initial data to smaller values, then there is no problem. Why is that?
Answer 0 (score: 4)
You are overflowing float32, because the learning rate is too high for your problem: instead of converging, the weight variable (W) oscillates with a larger and larger amplitude on every step of gradient descent.
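To see the oscillation concretely, here is a minimal pure-Python sketch of the same gradient descent update on the question's data (the helper name gd_step and the starting point 100., 100. are made up for illustration; they stand in for the random initialization):

x_vals = [10., 20., 30., 40.]
y_vals = [20., 40., 60., 80.]

def gd_step(W, b, a):
    # gradients of the mean squared error with respect to W and b
    n = float(len(x_vals))
    dW = sum(2. * x * (W * x + b - y) for x, y in zip(x_vals, y_vals)) / n
    db = sum(2. * (W * x + b - y) for x, y in zip(x_vals, y_vals)) / n
    return W - a * dW, b - a * db

W, b = 100., 100.
for i in range(5):
    W, b = gd_step(W, b, 0.1)  # the question's learning rate
    print("%d %g %g" % (i, W, b))

W flips sign and grows by a factor of roughly 150 every step; in float32 that overflows to inf and then nan within a few dozen steps, which matches the output above.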
If you change
a = tf.Variable(0.1)
to
a = tf.Variable(0.001)
the weights should converge much better. You will probably also want to increase the number of iterations (to around 50000).
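As a rough back-of-the-envelope check (ignoring the bias b for simplicity): each gradient descent step scales the error in W by about |1 - 2·a·mean(x²)|. For this data, mean(x²) = (10² + 20² + 30² + 40²) / 4 = 750, so a = 0.1 gives a factor of about |1 - 150| = 149 per step (explosive oscillation), while a = 0.001 gives |1 - 1.5| = 0.5 (still flipping sign, but shrinking toward the solution).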
Picking a good learning rate is often the first challenge when implementing or using a machine learning algorithm. Loss values that grow instead of converging to a minimum are usually a sign that the learning rate is too high.
In your case, the specific problem of fitting a line is especially prone to diverging weights when the training data contains values of large magnitude. That is one reason it is common to normalize data before training a neural network.
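For example, one simple way to scale this data into a small range before training (a sketch; dividing by the maximum is just one option, and standardizing to zero mean and unit variance is equally common):

x_max = max(x_data)  # 40.0
y_max = max(y_data)  # 80.0
x_norm = [x / x_max for x in x_data]  # values now in (0, 1]
y_norm = [y / y_max for y in y_data]
# then train on the scaled data:
# sess.run(train, feed_dict={X: x_norm, Y: y_norm})
# and undo the scaling when predicting, e.g. for X = 5:
# print sess.run(hypothesis, feed_dict={X: 5. / x_max}) * y_max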
In addition, your initial weight and bias are drawn from a very wide range, which means they can start very far from the ideal values, producing very large loss values and gradients at the beginning. Picking a sensible range for the initial values is another important consideration once you move on to more advanced learning algorithms.
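For instance, drawing the initial values from a much narrower range, using the same TF 0.x-era API as the question (a sketch; [-1, 1] is a common default rather than the only good choice):

W = tf.Variable(tf.random_uniform([1], -1., 1.))
b = tf.Variable(tf.random_uniform([1], -1., 1.))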