I am trying to write an example that uses GradientDescentOptimizer, but the optimization gets stuck almost immediately. All of my data is generated from the formula y = (2 * x_1) + (8 * x_2), so with no local minima, shouldn't gradient descent find the optimal solution easily?
import numpy as np
import os
import random
import tensorflow as tf
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.logging.set_verbosity(tf.logging.ERROR)
np.random.seed(101)
tf.set_random_seed(101)
n_values = 100
learning_rate = 0.001
training_epochs = 1000
# Generate noise-free training data from y = 2*x_1 + 8*x_2.
x_vals = np.random.random_sample((n_values, 2))
y_vals = [(2 * x_vals[i][0] + 8 * x_vals[i][1]) for i in range(n_values)]
y_vals = np.reshape(y_vals, (-1, 1))
n_dims = x_vals.shape[1]
X = tf.placeholder(tf.float32, [None, 2])
Y = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.ones([1, n_dims]))
y_pred = tf.reduce_sum(tf.multiply(X, W), axis=(-1, 1))
cost = tf.reduce_sum(tf.pow(y_pred - Y, 2)) / (2 * tf.cast(n_values, tf.float32))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        sess.run(optimizer, feed_dict={X: x_vals, Y: y_vals})
        if epoch % 50 == 0:
            c = sess.run(cost, feed_dict={X: x_vals, Y: y_vals})
            print("Epoch", (epoch + 1), ": cost =", c, "W =", sess.run(W))
Here are the results:
Epoch 1 : cost = 1048.1746 W = [[1.2004547 1.21069 ]]
Epoch 51 : cost = 429.50342 W = [[4.111497 4.421291]]
Epoch 101 : cost = 428.04016 W = [[4.170494 4.6341734]]
Epoch 151 : cost = 427.94107 W = [[4.1271544 4.6886673]]
Epoch 201 : cost = 427.90067 W = [[4.0954566 4.720226 ]]
Epoch 251 : cost = 427.88373 W = [[4.0747733 4.740489 ]]
Epoch 301 : cost = 427.87656 W = [[4.0613766 4.7535996]]
Epoch 351 : cost = 427.8736 W = [[4.0527034 4.762087 ]]
Epoch 401 : cost = 427.8724 W = [[4.0470877 4.767582 ]]
Epoch 451 : cost = 427.87186 W = [[4.043453 4.7711387]]
Epoch 501 : cost = 427.87167 W = [[4.0411 4.7734404]]
Epoch 551 : cost = 427.87155 W = [[4.039577 4.7749314]]
Epoch 601 : cost = 427.87146 W = [[4.0385904 4.775896 ]]
Epoch 651 : cost = 427.87152 W = [[4.0379524 4.7765207]]
Epoch 701 : cost = 427.87146 W = [[4.0375395 4.776925 ]]
Epoch 751 : cost = 427.87143 W = [[4.0372725 4.7771864]]
Epoch 801 : cost = 427.87146 W = [[4.0370994 4.7773557]]
Epoch 851 : cost = 427.8714 W = [[4.0369873 4.777465 ]]
Epoch 901 : cost = 427.87146 W = [[4.036914 4.7775364]]
Epoch 951 : cost = 427.87146 W = [[4.036866 4.777584]]
The W values are still changing by tiny amounts, but if I increase the number of epochs they eventually stop changing entirely. I can change the learning rate, but every value I try gets stuck sooner or later. Why can't GradientDescentOptimizer find the solution for a perfect dataset with no randomness? Is there a problem with my code?
Answer 0 (score: 0)
The shapes of y_pred and Y must agree in the code below, but y_pred is one-dimensional (shape (100,)) while Y is two-dimensional (shape (100, 1)):
y_pred = tf.reduce_sum(tf.multiply(X, W), axis=(-1, 1))
cost = tf.reduce_sum(tf.pow(y_pred - Y, 2)) / (2 * tf.cast(n_values, tf.float32))
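Because of NumPy/TensorFlow broadcasting rules, subtracting a (100, 1) tensor from a (100,) tensor does not raise an error: y_pred - Y silently becomes a (100, 100) matrix of all pairwise differences. The optimizer is therefore minimizing a different cost than the one you intended, and the plateau at cost ≈ 427.87 is the true minimum of that broadcast objective; gradient descent is not stuck at all. A minimal NumPy sketch of the same broadcasting (the shapes here are just illustrations matching n_values = 100):

import numpy as np

y_pred = np.zeros(100)       # one-dimensional, shape (100,)
Y = np.zeros((100, 1))       # two-dimensional, shape (100, 1)

# Broadcasting pairs every prediction with every label instead of
# computing 100 per-example residuals.
print((y_pred - Y).shape)    # prints (100, 100)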
You can try the code below instead; it reshapes y_pred to (n_values, 1) so that it matches Y, and it produces the expected output.
y_pred = tf.reshape(tf.reduce_sum(tf.multiply(X, W), axis=1), shape=(-1, 1))
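As an alternative sketch (not part of the original answer): since the model is a plain linear map, storing W as a column vector and using tf.matmul yields a (None, 1) prediction directly, with no reduce-and-reshape step:

W = tf.Variable(tf.ones([n_dims, 1]))  # column vector instead of (1, n_dims)
y_pred = tf.matmul(X, W)               # (None, 2) x (2, 1) -> (None, 1), matching Y

Either version makes y_pred's shape match Y, after which the weights should converge toward the true values [2, 8].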