Why does my linear regression get nan values instead of learning?

Time: 2016-09-04 08:23:59

Tags: python tensorflow

I am running the following code:

import tensorflow as tf

# data set
x_data = [10., 20., 30., 40.]
y_data = [20., 40., 60., 80.]

# try to find values for w and b that compute y_data = W * x_data + b
# range is -1000 ~ 1000
W = tf.Variable(tf.random_uniform([1], -1000., 1000.))
b = tf.Variable(tf.random_uniform([1], -1000., 1000.))

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# my hypothesis
hypothesis = W * X + b

# Simplified cost function
cost = tf.reduce_mean(tf.square(hypothesis - Y))

# minimize
a = tf.Variable(0.1)  # learning rate, alpha
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)  # goal is minimize cost

# before starting, initialize the variables
init = tf.initialize_all_variables()

# launch
sess = tf.Session()
sess.run(init)

# fit the line
for step in xrange(2001):
    sess.run(train, feed_dict={X: x_data, Y: y_data})
    if step % 100 == 0:
        print step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W), sess.run(b)

print sess.run(hypothesis, feed_dict={X: 5})
print sess.run(hypothesis, feed_dict={X: 2.5})

It produces the output below, and I don't understand why the result is nan:

0 1.60368e+10 [ 4612.54003906] [ 406.81304932]
100 nan [ nan] [ nan]
200 nan [ nan] [ nan]
300 nan [ nan] [ nan]
400 nan [ nan] [ nan]
500 nan [ nan] [ nan]
600 nan [ nan] [ nan]
700 nan [ nan] [ nan]
800 nan [ nan] [ nan]
900 nan [ nan] [ nan]
1000 nan [ nan] [ nan]
1100 nan [ nan] [ nan]
1200 nan [ nan] [ nan]
1300 nan [ nan] [ nan]
1400 nan [ nan] [ nan]
1500 nan [ nan] [ nan]
1600 nan [ nan] [ nan]
1700 nan [ nan] [ nan]
1800 nan [ nan] [ nan]
1900 nan [ nan] [ nan]
2000 nan [ nan] [ nan]
[ nan]
[ nan]

If I change the initial data, then it works without any problem. Why is that?

1 Answer:

Answer 0: (score: 4)

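You are overflowing float32 because the learning rate is too high for your problem. Instead of converging, the weight variable (W) oscillates with larger and larger magnitude on every step of gradient descent.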

If you change

a = tf.Variable(0.1)

to

a = tf.Variable(0.001)

the weights should converge better. You will probably also want to increase the number of iterations (to ~50000).
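To see the effect of the learning rate in isolation, here is a minimal sketch of the same batch gradient descent in plain Python rather than TensorFlow. The `fit` helper and the zero starting point are assumptions made for illustration (the question starts from random values), and Python floats are 64-bit, so overflow arrives later than with float32, but the outcome should be the same.

```python
import math

# Training data from the question.
x_data = [10.0, 20.0, 30.0, 40.0]
y_data = [20.0, 40.0, 60.0, 80.0]


def fit(learning_rate, steps, w=0.0, b=0.0):
    """Batch gradient descent on mean squared error for y = w * x + b."""
    n = len(x_data)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(x_data, y_data)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, x_data)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Learning rate 0.1: each step overshoots further until the values
# overflow to inf and then turn into nan.
w_high, b_high = fit(0.1, 2001)
print("lr=0.1   diverged:", not math.isfinite(w_high))

# Learning rate 0.001 with more iterations: settles near W = 2, b = 0.
w_low, b_low = fit(0.001, 50000)
print("lr=0.001 W=%.4f b=%.4f" % (w_low, b_low))
```

The run with 0.001 is stable but slow, because with inputs this large the loss is very steep along W and very flat along b. That is why the extra iterations are needed.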

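Picking a good learning rate is often the first challenge when implementing or using machine learning algorithms. Loss values that increase instead of converging to a minimum are usually a sign that the learning rate is too high.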

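In your case, the specific problem of fitting a line becomes more vulnerable to diverging weights when the training data has large magnitudes. This is one reason why it is common to normalise data before training, for example in neural networks.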
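As an illustration of the normalisation point, the sketch below standardises only the inputs and keeps the learning rate of 0.1 from the question. It is plain Python, not the asker's TensorFlow graph, and the zero starting values and the count of 200 steps are arbitrary choices for the example.

```python
# Training data from the question.
x_data = [10.0, 20.0, 30.0, 40.0]
y_data = [20.0, 40.0, 60.0, 80.0]

# Standardise the inputs to zero mean and unit variance.
n = len(x_data)
x_mean = sum(x_data) / n
x_std = (sum((x - x_mean) ** 2 for x in x_data) / n) ** 0.5
x_norm = [(x - x_mean) / x_std for x in x_data]

# Batch gradient descent on mean squared error, learning rate 0.1.
w, b = 0.0, 0.0
for _ in range(200):
    errors = [w * x + b - y for x, y in zip(x_norm, y_data)]
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, x_norm)) / n
    grad_b = 2.0 * sum(errors) / n
    w -= 0.1 * grad_w
    b -= 0.1 * grad_b

# Map the fitted parameters back to the original input scale.
W = w / x_std
B = b - w * x_mean / x_std
print("W=%.4f b=%.4f" % (W, B))
```

With standardised inputs the loss is equally curved in both parameters, so a rate that diverged on the raw data can converge in a few hundred steps.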

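In addition, your starting weight and bias are drawn from a very large range, which means they can be very far from the ideal values and produce very large loss values and gradients at the start. Picking a good range for initial values is another critical thing to get right as you move on to more advanced learning algorithms.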
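To give a rough sense of scale, this sketch compares the loss and gradient at two starting points. The `loss_and_gradient` helper and both starting points are hypothetical choices for illustration: (1000, 1000) sits at the edge of the range used in the question, and (1, 1) stands in for a small initial value.

```python
# Training data from the question.
x_data = [10.0, 20.0, 30.0, 40.0]
y_data = [20.0, 40.0, 60.0, 80.0]


def loss_and_gradient(w, b):
    """Mean squared error of y = w * x + b and its gradient in (w, b)."""
    n = len(x_data)
    errors = [w * x + b - y for x, y in zip(x_data, y_data)]
    loss = sum(e * e for e in errors) / n
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, x_data)) / n
    grad_b = 2.0 * sum(errors) / n
    return loss, grad_w, grad_b


# A start at the edge of the question's range versus a start near zero.
far_loss, far_grad_w, far_grad_b = loss_and_gradient(1000.0, 1000.0)
near_loss, near_grad_w, near_grad_b = loss_and_gradient(1.0, 1.0)
print("far : loss=%.3g grad_w=%.3g" % (far_loss, far_grad_w))
print("near: loss=%.3g grad_w=%.3g" % (near_loss, near_grad_w))
```

A closer start does not by itself make a rate of 0.1 stable on this data. It only removes the very large first steps, so the learning rate still has to be reduced or the data normalised.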