Question

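I am writing a TensorFlow program for linear regression. I am using the gradient descent algorithm to optimize (minimize) the loss function, but the value of the loss function increases while the program runs. My program and its output are below.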

    import tensorflow as tf
    W = tf.Variable([.3], dtype=tf.float32)
    b = tf.Variable([-.3], dtype=tf.float32)
    X = tf.placeholder(tf.float32)
    Y = tf.placeholder(tf.float32)
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    lm = W*X + b
    delta = tf.square(lm - Y)
    loss = tf.reduce_sum(delta)
    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train = optimizer.minimize(loss)
    for i in range(8):
        print(sess.run([W, b]))
        print("loss= %f" % sess.run(loss, {X: [10, 20, 30, 40], Y: [1, 2, 3, 4]}))
        sess.run(train, {X: [10, 20, 30, 40], Y: [1, 2, 3, 4]})
    sess.close()

The output of my program is:

2017-12-07 14:50:10.517685: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

[array([ 0.30000001], dtype=float32), array([-0.30000001],dtype=float32)]
loss= 108.359993

[array([-11.09999943], dtype=float32), array([-0.676], dtype=float32)]
loss= 377836.000000

[array([ 662.25195312], dtype=float32), array([ 21.77807617],  dtype=float32)]
loss= 1318221568.000000

[array([-39110.421875], dtype=float32), array([-1304.26794434],  dtype=float32)]
loss= 4599107289088.000000

[array([ 2310129.25], dtype=float32), array([ 77021.109375],  dtype=float32)]
loss= 16045701465112576.000000
[array([ -1.36451664e+08], dtype=float32), array([-4549399.],  dtype=float32)]
loss= 55981405829796462592.000000

[array([  8.05974733e+09], dtype=float32), array([  2.68717856e+08],  dtype=float32)]
loss= 195312036582209632600064.000000

Please explain why the value of the loss is increasing instead of decreasing.

Answer 1

Have you tried changing the learning rate? Using a lower learning rate (~1e-4) and more iterations should work.
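To see the effect of the learning rate without TensorFlow, the training loop from the question can be mirrored in plain NumPy. This is a minimal sketch rather than the asker's code: the helper name `run_gradient_descent` and the step counts are illustrative assumptions, while the data, initial values and loss follow the question.

```python
import numpy as np

def run_gradient_descent(learning_rate, steps):
    """Plain gradient descent on loss = sum((W*x + b - y)^2).

    Hypothetical helper that mirrors the question's model in NumPy.
    Returns the loss measured before each update.
    """
    x = np.array([10.0, 20.0, 30.0, 40.0])
    y = np.array([1.0, 2.0, 3.0, 4.0])
    W, b = 0.3, -0.3  # same initial values as in the question
    losses = []
    for _ in range(steps):
        residual = W * x + b - y
        losses.append(float(np.sum(residual ** 2)))
        # Gradients of the summed squared error
        grad_W = np.sum(2.0 * residual * x)
        grad_b = np.sum(2.0 * residual)
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
    return losses

diverging = run_gradient_descent(learning_rate=0.01, steps=8)
converging = run_gradient_descent(learning_rate=1e-4, steps=200)

print("lr=0.01: first loss %.4f, last loss %.4g" % (diverging[0], diverging[-1]))
print("lr=1e-4: first loss %.4f, last loss %.6f" % (converging[0], converging[-1]))
```

With the larger rate the loss should grow at every step, as in the question's output; with the smaller rate it should shrink, although the intercept converges slowly, so more iterations help.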

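Here is more detail on why a lower learning rate may be needed. Note that your loss function is

L = \sum (Wx + b - Y)^2

with gradient dL/dW = \sum 2 (Wx + b - Y) x

and Hessian d^2L/dW^2 = \sum 2 x^2

Your loss is diverging because the learning rate is higher than the inverse of the Hessian, which for your data (\sum x^2 = 3000) is about 1/(2 * 3000), i.e. roughly 1.7e-4. So you should try lowering the learning rate.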

Note: I am not sure how to add math to a StackOverflow answer, so I had to write it this way.
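The bound can also be checked numerically. The sketch below treats W and b jointly, which extends the single-variable argument above and is my own assumption about how to apply it. It builds the 2x2 Hessian of the loss for the question's data and uses the standard result that gradient descent on a quadratic loss is stable only when the learning rate is below 2 / (largest Hessian eigenvalue).

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])

# Hessian of L = sum((W*x + b - Y)^2) with respect to (W, b).
# It does not depend on Y or on the current values of W and b.
hessian = 2.0 * np.array([
    [np.sum(x * x), np.sum(x)],
    [np.sum(x),     float(len(x))],
])

largest_eigenvalue = np.linalg.eigvalsh(hessian).max()
max_stable_lr = 2.0 / largest_eigenvalue

print("d^2L/dW^2          =", hessian[0, 0])
print("largest eigenvalue =", largest_eigenvalue)
print("stable if lr <     ", max_stable_lr)
print("0.01 is stable?    ", 0.01 < max_stable_lr)
```

For this data the limit comes out a little above 3e-4, so a learning rate of 0.01 is far outside the stable range while ~1e-4 is inside it.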

Answer 2

To do linear regression, this is the code I have been using with numpy:

    import numpy as np
    import tensorflow as tf
    import matplotlib.pyplot as plt
    import pandas as pd
    print(tf.__version__)

    # Jupyter/IPython magic; remove this line when running as a plain script
    %matplotlib inline
    plt.rcParams['figure.figsize'] = (10, 6)

    x = np.arange(start=0.0, stop=5.0, step=0.1)

    # You can adjust the slope and intercept to verify the changes in the graph
    W = 1
    b = 0

    # We define the linear equation
    y = W*x + b

    # And plot it thanks to matplotlib
    plt.plot(x, y)
    plt.ylabel('Dependent Variable')
    plt.xlabel('Independent Variable')
    plt.show()

Answer 3

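With TensorFlow, you can do linear regression with code similar to the following. The code uses np, pd, plt and tf, so it assumes the same imports as in the previous answer: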

    def graph_formula_vs_data(formula, x_vector, y_vector): 
        """
        This function graphs a formula in the form of a line, vs. data points
        """
        x = np.array(range(0, int(max(x_vector))))  
        y = eval(formula)
        plt.plot(x, y)
        plt.plot(x_vector, y_vector, "ro")
        plt.show()

    df = pd.read_csv('./linear_reg_exam_dataset.csv', usecols=[0, 1], skiprows=[0], header=None)
    d = df.values
    data = np.float32(d)

    dataset = pd.DataFrame({'x': data[:, 0], 'y': data[:, 1]})

    # Number of epochs (times we make the model go through all the data)
    n_epochs = 100

    # Model parameters
    # dtype must be passed by keyword: the second positional argument of
    # tf.Variable is `trainable`, not the dtype
    W = tf.Variable([0.], dtype=tf.float32)
    b = tf.Variable([0.], dtype=tf.float32)

    y = dataset['y']  # define the target variable (dependent variable) as y
    x = dataset['x']
    # Random mask: roughly 80% of the rows go to training, the rest to validation
    msk = np.random.rand(len(df)) < 0.8

    # Model input and output
    x_train = x[msk].values.tolist()
    y_train = y[msk].values.tolist()

    # Validation data (with this we validate that the model has learned to generalize the problem)
    x_val = x[~msk].values.tolist()
    y_val = y[~msk].values.tolist()


    # Model definition
    @tf.function
    def linear_model(x, W, b):
        return W*x + b


    # Cost function
    loss = lambda: tf.reduce_sum(tf.math.squared_difference(y_train, linear_model(x_train, W, b)))
    # Optimizer to do the gradient descent
    optimizer = tf.optimizers.SGD(0.0000000000001)

    # We perform n_epochs training iterations
    for i in range(n_epochs):
        optimizer.minimize(loss, var_list=[W, b])

        # Every 10 epochs we print how W and b evolve and the amount of error there is
        if i % 10 == 0 or i == n_epochs-1:
            print("Epoch {}".format(i))
            print("W: {}".format(W.numpy()))
            print("b: {}".format(b.numpy()))
            print("loss: {}".format(loss()))
            # This formula represents W * x + b in string form to be able to graph it
            stringfied_formula = str(W.numpy()) + "*x +" + str(b.numpy())
            graph_formula_vs_data(formula=stringfied_formula, x_vector=x_train, y_vector=y_train)
            print("\n")

Epoch 99 W: [0.39189553] b: [0.00059491] loss: 1458421628928.0

    # Evaluation of the model with validation data
    stringfied_formula = str(W.numpy()) + "*x +" + str(b.numpy())
    graph_formula_vs_data(formula=stringfied_formula, x_vector=x_val, y_vector=y_val)
    loss = lambda: tf.reduce_sum(tf.math.squared_difference(y_val, linear_model(x_val, W, b)))
    print("\nValidation: ")
    print("W: {}".format(W.numpy()))
    print("b: {}".format(b.numpy()))
    print("loss: {}".format(loss()))
    graph_formula_vs_data(formula=stringfied_formula, x_vector=x_val, y_vector=y_val)

Validation: W: [75.017586] b: [0.11139687] loss: 8863.4775390625
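As a sanity check on a gradient-descent fit like the one above, the same least-squares problem can be solved in closed form and the result compared against the learned W and b. The sketch below uses synthetic data, since the CSV from this answer is not available here; `x_train`, `y_train`, the slope 3.0 and the intercept 5.0 are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the training split (hypothetical values)
x_train = rng.uniform(0.0, 100.0, size=200)
y_train = 3.0 * x_train + 5.0 + rng.normal(0.0, 1.0, size=200)

# Design matrix with a column of ones for the intercept
A = np.column_stack([x_train, np.ones_like(x_train)])

# Least-squares solution of A @ [W, b] ~= y_train
(W_ref, b_ref), *_ = np.linalg.lstsq(A, y_train, rcond=None)

loss_ref = np.sum((W_ref * x_train + b_ref - y_train) ** 2)
print("W: {:.4f}  b: {:.4f}  loss: {:.4f}".format(W_ref, b_ref, loss_ref))
```

If the loss reached by gradient descent stays far above the closed-form loss on the same data, the learning rate or the number of epochs is a likely place to look.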

TensorFlow - Linear regression

3 answers: