To better understand TensorFlow, I created a simple gradient descent scenario. When I run it and use my "guess" to manually calculate the loss value, I don't get the same value that TensorFlow spits out, and I can't see why.
The program I wrote uses gradient descent to recover the result of the matrix operation f(X*A) * B, where f() is the sigmoid function, X (1xn) is the placeholder/input value, and A (nxn) and B (nx1) are the matrices to be discovered. The values of A and B are filled at the start with linearly increasing values. To begin with, I'm just setting n to 2.
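Just to make the target computation and the shapes concrete, here is a minimal numpy-only sketch of the forward model being fitted (the sigmoid helper and variable names here are mine; the full test program follows below):

import numpy

n = 2
A_actual = numpy.linspace(0, 1, n**2).reshape(n, n)   # (n x n) matrix to be discovered
B_actual = numpy.linspace(0, 1, n).reshape(n, 1)      # (n x 1) matrix to be discovered

def sigmoid(z):
    return 1.0 / (1.0 + numpy.exp(-z))                # f(), applied elementwise

X = numpy.random.rand(1, n)                           # (1 x n) input row
y = numpy.matmul(sigmoid(numpy.matmul(X, A_actual)), B_actual)   # f(X*A) * B, a (1 x 1) result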
Here is a test program that illustrates the problem:
import numpy
import tensorflow

n = 2
A_actual = numpy.linspace(0, 1, n**2).reshape(n, n)  # the matrices to be discovered,
B_actual = numpy.linspace(0, 1, n).reshape(n, 1)     # filled with linearly increasing values
A = tensorflow.Variable(tensorflow.ones((n, n)), name='A')  # current guesses, initialised to ones
B = tensorflow.Variable(tensorflow.ones((n, 1)), name='B')
X = tensorflow.placeholder("float", shape=[1, n], name='X')
y = tensorflow.placeholder("float", name='y')
y_hat = tensorflow.matmul(tensorflow.nn.sigmoid(tensorflow.matmul(X, A)), B)  # f(X*A) * B
loss = tensorflow.losses.mean_squared_error(labels=y, predictions=y_hat)
cost = tensorflow.reduce_mean(loss)
updates = tensorflow.train.GradientDescentOptimizer(0.01).minimize(cost)
with tensorflow.Session() as sess:
    init = tensorflow.global_variables_initializer()
    sess.run(init)
    for epoch in range(1, 10):
        # generate a random input and the corresponding output from the "actual" matrices
        train_X = numpy.random.rand(n).reshape(1, n)
        h = numpy.matmul(train_X, A_actual)
        train_y = numpy.matmul(h / (numpy.exp(-h) + 1), B_actual)
        # run one gradient descent step and fetch the loss
        _, c = sess.run([updates, loss], {X: train_X, y: train_y })
        A_guess = A.eval()
        B_guess = B.eval()
        # work out the expected loss:
        h_guess = numpy.matmul(train_X, A_guess)
        train_y = numpy.matmul(h / (numpy.exp(-h) + 1), B_actual)
        y_hat = numpy.matmul(h_guess / (numpy.exp(-h_guess) + 1), B_guess)
        expected_cost = (train_y - y_hat)**2
        print("A={}, B={}, train_X = {}, c={}, expected_c={}".format(A_guess, B_guess, train_X, c, expected_cost))
At each epoch, I expect the values of expected_c and c to match, but they are not the same. Here is the output from a few epochs:
A=[[0.99831355 0.99831355]
[0.9978205 0.9978205 ]], B=[[0.9855833]
[0.9855833]], train_X = [[0.43161333 0.55779766]], c=0.977798759937, expected_c=[[0.90071899]]
A=[[0.99674106 0.99674106]
[0.99594545 0.99594545]], B=[[0.97247064]
[0.97247064]], train_X = [[0.75101306 0.89550778]], c=0.612140238285, expected_c=[[3.25077074]]
A=[[0.9963331 0.9963331]
[0.9934323 0.9934323]], B=[[0.9615876]
[0.9615876]], train_X = [[0.15488769 0.95426499]], c=0.524783551693, expected_c=[[0.73085703]]
A=[[0.99290335 0.99290335]
[0.9930714 0.9930714 ]], B=[[0.9457934]
[0.9457934]], train_X = [[0.7305608 0.07687351]], c=1.30655503273, expected_c=[[0.74179058]]
A=[[0.9906516 0.9906516]
[0.9914385 0.9914385]], B=[[0.93114746]
[0.93114746]], train_X = [[0.74625195 0.54115622]], c=0.876540482044, expected_c=[[1.72666188]]
A=[[0.9897084 0.9897084]
[0.9894199 0.9894199]], B=[[0.91981167]
[0.91981167]], train_X = [[0.39296997 0.84106038]], c=0.538159787655, expected_c=[[1.05986646]]
A=[[0.9873394 0.9873394]
[0.9880559 0.9880559]], B=[[0.9053085]
[0.9053085]], train_X = [[0.7454906 0.42922246]], c=0.906145870686, expected_c=[[1.32207708]]
A=[[0.98699 0.98699 ]
[0.9865663 0.9865663]], B=[[0.89463204]
[0.89463204]], train_X = [[0.0955704 0.4074265]], c=0.737196862698, expected_c=[[0.08112794]]
A=[[0.9847778 0.9847778]
[0.9857968 0.9857968]], B=[[0.88113374]
[0.88113374]], train_X = [[0.5787612 0.20131812]], c=0.975076794624, expected_c=[[0.47560335]]
There doesn't seem to be any relationship between expected_c and c, but I have double-checked my cost calculation (against the tf.sigmoid and tf.losses.mean_squared_error pages) and I can't find any discrepancy.
Why aren't these coming out with the same value?
(Note that I'm not yet worrying about the fact that my guesses don't seem to be converging; I'll deal with that once I understand the cost function!)
Answer 0 (score: 0)
As xdurch0 correctly pointed out, the loss is computed as part of the graph evaluation, so the update is applied only after it has been computed. In other words, the c returned by the combined sess.run call reflects the values of A and B before the gradient step, while my expected_c is calculated from A_guess and B_guess after the step.
To fix this, I replaced this line:
_, c = sess.run([updates, loss], {X: train_X, y: train_y })
with these two lines:
sess.run(updates, {X: train_X, y: train_y })
c = sess.run(loss, {X: train_X, y: train_y })
This does mean the graph is evaluated twice: once for the updates call, and once more to compute loss after the gradient descent step has been applied.
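An alternative sketch (not a tested fix; it reuses sess, A, B, updates, loss, X, y, train_X and train_y from the loop in the question) avoids the second evaluation by reading the variables out before the combined call and computing the expected loss from those pre-update values. The 1/(1 + exp(-h)) form below is the standard logistic that tensorflow.nn.sigmoid applies, so this number should agree with the returned c up to float32/float64 precision:

A_before, B_before = sess.run([A, B])                 # variable values before the gradient step
_, c = sess.run([updates, loss], {X: train_X, y: train_y})
h_before = numpy.matmul(train_X, A_before)
y_hat_before = numpy.matmul(1.0 / (numpy.exp(-h_before) + 1.0), B_before)
expected_c = numpy.mean((train_y - y_hat_before)**2)  # MSE of a (1 x 1) prediction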