Question

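I am working through some details of TensorFlow to improve my understanding. I can't for the life of me figure out why my cost/loss increases while the model continues to converge. I have tried various learning rates, different losses (one is commented out), and so on, and I still cannot understand why the cost increases while the model keeps updating toward the correct parameters (W, b - see the output below). Everyone at work says my model is not training, but there is something else going on here that I don't understand. The behavior is the same on two separate machines and operating systems.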

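If I set the learning rate extremely low, the loss always decreases, but the model never reaches the "target" output (understandable - low learning rate, slow convergence). What I don't get is that even though I can get the model to converge and even stop at the correct parameters, the loss does not behave correctly on the way there. It should stabilize once W and b stabilize (and I can get W and b to stabilize at the correct values), and if the loss is increasing, then W and b should be moving AWAY from the 'true' values.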

It seems to depend on the spread of the data: with noise_scale = 0.1 the problem appears, while with a noise scale of 0.5 it behaves much better.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
rng = np.random
# Parameters
learning_rate = 0.0001
training_epochs = 1000
display_step = 50
# Generate synthetic data
N = 100
w_true = 5
b_true = 2
noise_scale = .1
x_np = np.random.rand(N, 1)
noise = np.random.normal(scale=noise_scale, size=(N, 1))
# Convert shape of y_np to (N,)
y_np = np.reshape(w_true * x_np  + b_true + noise, (-1))
train_X=x_np
train_Y =y_np
n_samples = train_X.shape[0]
# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)
# Mean squared error
#cost = tf.reduce_sum(tf.pow(pred-Y, 2))
cost = tf.losses.mean_squared_error(Y,pred)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
# Start training
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        #Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y:train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), 
                "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    #Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()

My output is:

Epoch: 0050 cost= 4.118685722 W= 2.9452877 b= 2.4476562
Epoch: 0100 cost= 3.944761038 W= 3.3132584 b= 2.7724605
Epoch: 0150 cost= 4.055146694 W= 3.511169 b= 2.8108914
Epoch: 0200 cost= 4.153135777 W= 3.6549997 b= 2.7731476
Epoch: 0250 cost= 4.238377571 W= 3.7775438 b= 2.7175744
Epoch: 0300 cost= 4.317091465 W= 3.888368 b= 2.6601844
Epoch: 0350 cost= 4.391473293 W= 3.9905539 b= 2.605233
Epoch: 0400 cost= 4.462153912 W= 4.0853305 b= 2.5536873
Epoch: 0450 cost= 4.529345512 W= 4.1734138 b= 2.5056264
Epoch: 0500 cost= 4.593110561 W= 4.255291 b= 2.4608996
Epoch: 0550 cost= 4.653543472 W= 4.331433 b= 2.4192963
Epoch: 0600 cost= 4.710726738 W= 4.402243 b= 2.3806055
Epoch: 0650 cost= 4.764744759 W= 4.468083 b= 2.344631
Epoch: 0700 cost= 4.815713882 W= 4.5293126 b= 2.311165
Epoch: 0750 cost= 4.863757610 W= 4.586264 b= 2.2800498
Epoch: 0800 cost= 4.908967018 W= 4.639206 b= 2.2511148
Epoch: 0850 cost= 4.951507092 W= 4.6884646 b= 2.224208
Epoch: 0900 cost= 4.991419315 W= 4.734207 b= 2.1991928
Epoch: 0950 cost= 5.028926373 W= 4.776786 b= 2.1759307
Epoch: 1000 cost= 5.064134598 W= 4.816405 b= 2.154287
Optimization Finished!
Training cost= 5.0641346 W= 4.816405 b= 2.154287
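
One thing that may be worth checking, offered as a possible explanation rather than a confirmed diagnosis: in the script, train_X has shape (N, 1) while train_Y is reshaped to (N,). When the full arrays are fed to the cost, the prediction is (N, 1) and the target is (N,), and an elementwise difference between those shapes broadcasts to (N, N). The NumPy-only sketch below rebuilds data the same way and compares that broadcast "cost" against a shape-matched MSE at three (W, b) pairs copied from the logged output. The seed and the helper names broadcast_cost and matched_mse are illustrative assumptions, not part of the original script, and the sketch does not run TensorFlow itself.

```python
import numpy as np

# Synthetic data built the same way as in the question. The seed is an
# arbitrary choice for reproducibility, not part of the original script.
rng = np.random.default_rng(0)
N = 100
w_true, b_true, noise_scale = 5.0, 2.0, 0.1
x = rng.random((N, 1))                               # shape (N, 1)
noise = rng.normal(scale=noise_scale, size=(N, 1))
y = np.reshape(w_true * x + b_true + noise, (-1))    # shape (N,)


def broadcast_cost(w, b):
    """Mean of (pred - y)**2 when pred is (N, 1) and y is (N,).

    The difference broadcasts to (N, N), so this averages over every
    (prediction, target) pair instead of only the matching pairs.
    """
    pred = w * x + b
    return np.mean((pred - y) ** 2)


def matched_mse(w, b):
    """Mean squared error with pred and y both flattened to (N,)."""
    pred = (w * x + b).reshape(-1)
    return np.mean((pred - y) ** 2)


# (W, b) pairs copied from the logged output at epochs 100, 500 and 1000.
checkpoints = [
    (3.3132584, 2.7724605),
    (4.255291, 2.4608996),
    (4.816405, 2.154287),
]

print("difference shape:", ((w_true * x + b_true) - y).shape)
for w, b in checkpoints:
    print(f"W={w:.4f} b={b:.4f} "
          f"broadcast={broadcast_cost(w, b):.4f} "
          f"matched={matched_mse(w, b):.4f}")
```

If the same broadcasting happens inside the TensorFlow cost, the training loop itself would be unaffected, because it feeds one sample at a time, and only the logged cost would be misleading. Under that assumption, feeding the targets with a matching shape when evaluating, for example train_Y.reshape(-1, 1), should make the reported loss track the fit.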

TensorFlow example - loss increases but the model converges successfully

0 Answers: