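I am writing a TensorFlow program for linear regression. I am using the gradient descent algorithm to optimize (minimize) the loss function, but the value of the loss increases while the program runs. My program and its output are below.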
import tensorflow as tf
W = tf.Variable([.3],dtype=tf.float32)
b = tf.Variable([-.3],dtype=tf.float32)
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
lm = W*X + b
delta = tf.square(lm-Y)
loss = tf.reduce_sum(delta)
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
for i in range(8):
    print(sess.run([W, b]))
    print("loss= %f" %sess.run(loss,{X:[10,20,30,40],Y:[1,2,3,4]}))
    sess.run(train, {X: [10,20,30,40],Y: [1,2,3,4]})
sess.close()
The output of my program is:
2017-12-07 14:50:10.517685: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
[array([ 0.30000001], dtype=float32), array([-0.30000001],dtype=float32)]
loss= 108.359993
[array([-11.09999943], dtype=float32), array([-0.676], dtype=float32)]
loss= 377836.000000
[array([ 662.25195312], dtype=float32), array([ 21.77807617], dtype=float32)]
loss= 1318221568.000000
[array([-39110.421875], dtype=float32), array([-1304.26794434], dtype=float32)]
loss= 4599107289088.000000
[array([ 2310129.25], dtype=float32), array([ 77021.109375], dtype=float32)]
loss= 16045701465112576.000000
[array([ -1.36451664e+08], dtype=float32), array([-4549399.], dtype=float32)]
loss= 55981405829796462592.000000
[array([ 8.05974733e+09], dtype=float32), array([ 2.68717856e+08], dtype=float32)]
loss= 195312036582209632600064.000000
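Please explain why the loss value is increasing instead of decreasing.
Answer 0 (score: 1)
Have you tried changing the learning rate? Using a lower learning rate (~1e-4) with more iterations should work.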
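As a rough check of this suggestion, here is a minimal NumPy sketch (not the asker's TensorFlow code) that runs plain gradient descent on the data and starting values from the question. The helper name `run_gd` and the step counts are assumptions chosen for illustration.

```python
import numpy as np

# Data and initial parameters copied from the question.
X = np.array([10.0, 20.0, 30.0, 40.0])
Y = np.array([1.0, 2.0, 3.0, 4.0])


def run_gd(learning_rate, steps, w=0.3, b=-0.3):
    """Plain gradient descent on loss = sum((w*X + b - Y)**2).

    Hypothetical helper for illustration; returns (w, b, loss history).
    """
    losses = []
    for _ in range(steps):
        residual = w * X + b - Y
        losses.append(float(np.sum(residual ** 2)))
        grad_w = 2.0 * np.sum(residual * X)  # dL/dW
        grad_b = 2.0 * np.sum(residual)      # dL/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, losses


# Learning rate from the question: the loss grows on every step.
_, _, diverging = run_gd(learning_rate=0.01, steps=8)
print(diverging[:3])

# Smaller learning rate with more iterations: the loss shrinks.
w, b, converging = run_gd(learning_rate=1e-4, steps=2000)
print(w, b, converging[-1])
```

With the question's learning rate, the first two losses should come out close to the 108.36 and 377836 shown in the question's output. With 1e-4 the loss falls steadily, and more iterations move W and b further toward the exact fit W = 0.1, b = 0.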
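Here is more reasoning on why a lower learning rate may be needed. Note that your loss function is
L = \sum_i (W x_i + b - Y_i)^2
with dL/dW = \sum_i 2 (W x_i + b - Y_i) x_i
and Hessian d^2L/dW^2 = \sum_i 2 x_i^2
Your loss is diverging because the learning rate is above the inverse of the Hessian, which here is about 1/(2 * 3000), since \sum_i x_i^2 = 3000 for x = 10, 20, 30, 40. So you should try lowering the learning rate.
Note: I am not sure how to add math to a StackOverflow answer, so I had to write it this way.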
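The numbers in this argument can be checked with a short NumPy sketch. It computes the second derivative with respect to W from the formula above. As an addition not in the answer, it also builds the full Hessian with respect to (W, b) and applies the standard bound for a quadratic loss: fixed-step gradient descent diverges once the learning rate exceeds 2 divided by the largest Hessian eigenvalue. The answer's 1/Hessian figure is a more conservative version of that bound.

```python
import numpy as np

X = np.array([10.0, 20.0, 30.0, 40.0])  # inputs from the question
n = X.size

# Second derivative with respect to W alone: sum(2 * x * x).
d2L_dW2 = 2.0 * np.sum(X ** 2)

# Full Hessian with respect to (W, b); including b is an addition here.
hessian = 2.0 * np.array([[np.sum(X ** 2), np.sum(X)],
                          [np.sum(X), n]])
largest_eigenvalue = np.linalg.eigvalsh(hessian).max()

# For a quadratic loss, fixed-step gradient descent diverges once the
# learning rate exceeds 2 / (largest Hessian eigenvalue).
divergence_threshold = 2.0 / largest_eigenvalue

print(d2L_dW2)               # second derivative in W
print(1.0 / d2L_dW2)         # the answer's "inverse of the Hessian"
print(divergence_threshold)  # learning rates above this diverge
```

The question's learning rate of 0.01 is far above both values, while the suggested 1e-4 is below both.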
Answer 1 (score: 0)
For linear regression, this is the code I have been using with numpy:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd
print(tf.__version__)
# IPython/Jupyter magic; remove this line when running as a plain Python script
%matplotlib inline
plt.rcParams['figure.figsize'] = (10, 6)
x = np.arange(start=0.0, stop=5.0, step=0.1)
# You can adjust the slope and intercept to see how the graph changes
W=1
b=0
# Define the linear equation
y= W*x + b
# Plot it with matplotlib
plt.plot(x,y)
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
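The snippet above only plots a line with a fixed slope and intercept. As a minimal sketch of fitting one with NumPy, assuming the data from the question rather than anything this answer used, `np.polyfit` with degree 1 returns the least-squares slope and intercept without any learning rate to tune:

```python
import numpy as np

# Data copied from the question (an assumption; this answer has no data).
x = np.array([10.0, 20.0, 30.0, 40.0])
y = np.array([1.0, 2.0, 3.0, 4.0])

# Degree-1 least-squares fit; coefficients come back highest power first.
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)
```

For this data the fit should be very close to slope 0.1 and intercept 0, which is the solution gradient descent is expected to approach.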
Answer 2 (score: 0)
With TensorFlow, linear regression can be done with code similar to the following:
def graph_formula_vs_data(formula, x_vector, y_vector):
    """
    Graph a formula in the form of a line against the data points.
    `formula` is a string expression in `x`, evaluated with eval().
    """
    x = np.array(range(0, int(max(x_vector))))
    y = eval(formula)  # uses the local array x defined above
    plt.plot(x, y)
    plt.plot(x_vector, y_vector, "ro")
    plt.show()
df=pd.read_csv('./linear_reg_exam_dataset.csv',usecols = [0,1],skiprows = [0],header=None)
d = df.values
data = np.float32(d)
dataset = pd.DataFrame({'x': data[:, 0], 'y': data[:, 1]})
# Number of epochs (times we make the model go through all the data)
n_epochs = 100
# Model parameters
# Pass dtype by keyword: the second positional argument of tf.Variable is `trainable`
W = tf.Variable([0.], dtype=tf.float32)
b = tf.Variable([0.], dtype=tf.float32)
y = dataset['y'] # define the target variable (dependent variable) as y
x = dataset['x']
msk = np.random.rand(len(df)) < 0.8
# Model input and output
x_train = x[msk].values.tolist()
y_train = y[msk].values.tolist()
# Validation data (with this we validate that the model has learned to generalize the problem)
x_val = x[~msk].values.tolist()
y_val = y[~msk].values.tolist()
# Model definition
@tf.function
def linear_model(x, W, b):
    return W*x + b
# Cost function
loss = lambda: tf.reduce_sum(tf.math.squared_difference(y_train,linear_model(x_train, W, b)))
# optimizer to do the gradient descent
optimizer = tf.optimizers.SGD(0.0000000000001)
# We perform n_epochs training iterations
for i in range(n_epochs):
    optimizer.minimize(loss, var_list=[W, b])
    # Every 10 epochs, print how W and b evolve and how much error remains
    if i % 10 == 0 or i == n_epochs-1:
        print("Epoch {}".format(i))
        print("W: {}".format(W.numpy()))
        print("b: {}".format(b.numpy()))
        print("loss: {}".format(loss()))
        # This formula represents W * x + b in string form so it can be graphed
        stringfied_formula=str(W.numpy()) + "*x +" + str(b.numpy())
        graph_formula_vs_data(formula=stringfied_formula, x_vector=x_train, y_vector=y_train)
        print("\n")
Epoch 99
W: [0.39189553]
b: [0.00059491]
loss: 1458421628928.0
# Evaluation of the model with validation data
stringfied_formula=str(W.numpy()) + "*x +" + str(b.numpy())
graph_formula_vs_data(formula=stringfied_formula, x_vector=x_val, y_vector=y_val)
loss = lambda: tf.reduce_sum(tf.math.squared_difference(y_val,linear_model(x_val, W, b)))
print("\nValidation: ")
print("W: {}".format(W.numpy()))
print("b: {}".format(b.numpy()))
print("loss: {}".format(loss()))
graph_formula_vs_data(formula=stringfied_formula, x_vector=x_val, y_vector=y_val)
Validation:
W: [75.017586]
b: [0.11139687]
loss: 8863.4775390625
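The very small learning rate used above (0.0000000000001) fits the argument in Answer 0: when the inputs are large, the Hessian is large and only tiny steps are stable. One common alternative, which this answer does not use, is to standardize x before training so that an ordinary learning rate works. The sketch below is plain NumPy on the question's data, not the CSV dataset used here, and all variable names are assumptions for illustration.

```python
import numpy as np

# Data copied from the question (an assumption; not this answer's CSV).
x = np.array([10.0, 20.0, 30.0, 40.0])
y = np.array([1.0, 2.0, 3.0, 4.0])

# Standardize the input to zero mean and unit variance.
x_mean, x_std = x.mean(), x.std()
z = (x - x_mean) / x_std

w, b = 0.0, 0.0
learning_rate = 0.01  # the rate that diverged on the unscaled inputs
for _ in range(500):
    residual = w * z + b - y
    w -= learning_rate * 2.0 * np.sum(residual * z)
    b -= learning_rate * 2.0 * np.sum(residual)

# Convert back to the original x scale: y = slope * x + intercept.
slope = w / x_std
intercept = b - w * x_mean / x_std
print(slope, intercept)
```

After standardization the two parameters are decoupled and equally scaled, so the same 0.01 learning rate that diverged in the question converges here to roughly slope 0.1 and intercept 0.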