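I'm puzzled by the fact that the weights in my model get updated when I call sess.run without referencing the train step.
I'm trying to feed data to the model to get the estimated outputs, but when I call sess.run the weights get updated.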
### in the training phase ####
X_eval, Y_eval, O_eval, W_eval, cost_eval, train_step_eval = sess.run([X, Y, O_out, W, cost, train_step], feed_dict={X:x_batch , Y:y_batch})
### when the training is finished (closed for loop) ###
print(W_eval)
Y_out, W_eval2 = sess.run([O_out, W], feed_dict = {X:labeled_features[:,: - n_labels], Y:labeled_features[:,- n_labels :]})
print(W_eval2)
When I compare W_eval and W_eval2 they are not the same, and I don't understand why. Could you point me in the right direction as to why the weights differ?
'w3': array([[-2.9685912],
       [-3.215485 ],
       [ 3.8806837],
       [-3.331745 ],
       [-3.3904853]], dtype=float32)
'w3': array([[-2.9700036],
       [-3.2168453],
       [ 3.8804765],
       [-3.3330843],
       [-3.3922129]], dtype=float32)
Thanks.
Edit: added the W_eval assignment.
Answer 0 (score: 1)
Your code
### in the training phase ####
X_eval, Y_eval, O_eval, W_eval, cost_eval, train_step_eval = sess.run([X, Y, O_out, W, cost, train_step], feed_dict={X:x_batch , Y:y_batch})
### when the training is finished (closed for loop) ###
print(W_eval)
Y_out, W_eval2 = sess.run([O_out, W], feed_dict = {X:labeled_features[:,: - n_labels], Y:labeled_features[:,- n_labels :]})
print(W_eval2)
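still executes train_step in the same sess.run call that fetches W, so W_eval can hold the weights from before that final update, while W_eval2 holds the weights after it. A simpler version that shows what is happening: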
import tensorflow as tf
a = tf.get_variable('a', initializer=42.)
train_step = a.assign(a + 1)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    value, _ = sess.run([a, train_step]) # will update a
    print(value)
    value = sess.run([a]) # will not update a
    print(value)
    value = sess.run([a]) # will not update a
    print(value)
which gives the output
42.0
[43.0]
[43.0]
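If the goal is for the weights read at the end of training to match the weights read at evaluation time, one option is to fetch the variable in its own sess.run call, without train_step in the fetch list. The following is only a sketch under assumptions: it reuses the toy variable a from above in place of W, and it imports tensorflow.compat.v1 so that it also runs on TensorFlow 2.x installs (with plain TensorFlow 1.x, import tensorflow as tf without the disable_v2_behavior call should behave the same way).

```python
# Assumption: TensorFlow 1.14+ or 2.x, using the v1 compatibility API.
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()
tf.reset_default_graph()

# Toy stand-in for the model weights W and the optimizer's train_step.
a = tf.get_variable('a', initializer=42.)
train_step = a.assign(a + 1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        sess.run(train_step)      # training: run the update only
    after_training = sess.run(a)  # separate call, train_step is not fetched
    at_eval = sess.run(a)         # evaluation: no update runs here either

print(after_training, at_eval)
```

Because neither of the last two calls fetches train_step, both reads see the same, fully updated value.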
Another thing to check is whether x_batch == labeled_features[:,: - n_labels] actually holds.
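A small sketch of that check, assuming both inputs are NumPy arrays. The array contents and n_labels value below are made-up placeholders, not the data from the question. On arrays, == compares element by element (and may fail to broadcast when the shapes differ), so np.array_equal is the more convenient way to get a single True or False.

```python
import numpy as np

# Hypothetical placeholder data standing in for the question's arrays.
n_labels = 1
labeled_features = np.arange(12, dtype=np.float32).reshape(4, 3)
x_batch = labeled_features[:2, :-n_labels]  # a mini-batch of 2 rows

eval_inputs = labeled_features[:, :-n_labels]

# Compare shapes first, then contents; array_equal is False if shapes differ.
same_shape = x_batch.shape == eval_inputs.shape
same_data = np.array_equal(x_batch, eval_inputs)

print(same_shape, same_data)
```

If the last training batch is only a subset of the rows, O_out will differ between the two calls simply because the inputs differ, independent of any weight update.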