Simple TensorFlow architecture does not train

Asked: 2018-08-07 13:55:13

Tags: python tensorflow

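I am training a simple network that learns the identity mapping. It is very simple: the input x is a number, which is multiplied by a weight w to give the output y.

The weight w is initialized to 0.5 but should move toward the true value of 1.0. However, after training the network, the weight is still about 0.5.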

import tensorflow as tf
tf.reset_default_graph()
sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, shape=[None])

with tf.variable_scope('weight', reuse=True):
    w = tf.Variable([0.5])

weights = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='weight')

y = w*x
loss = tf.reduce_mean(y-x)
train_step = tf.train.AdamOptimizer(1e-3).minimize(loss, var_list=weights)
sess.run(tf.global_variables_initializer())
sess.run(train_step, feed_dict= {x:[2.0,3.5,4.6,7.8,6.5],y:[2.0,3.5,4.6,7.8,6.5]})

print(sess.run(weights))
#[array([ 0.49900001], dtype=float32)]

For such a simple network/problem, I would expect w to converge to 1.0 quickly.

Edit:

When I train for multiple epochs

for _ in range(10000):
    sess.run(train_step, feed_dict= {x:[2.0,3.5,4.6,7.8,6.5],y:[2.0,3.5,4.6,7.8,6.5]})

the weight diverges to:

[array([-99.50284576], dtype=float32)]

Edit 2:

I also found that my loss is computed as zero. I am not sure what is going on.

import numpy as np

data = [np.random.randn() for _ in range(100)]

for _ in range(100):
    _, loss_val = sess.run([train_step,loss] , feed_dict= {x:data,y:data})
    print ('loss = ' , loss_val)

Output:

loss =  0.0
loss =  0.0
loss =  0.0
loss =  0.0
loss =  0.0
loss =  0.0
...

1 answer:

Answer 0 (score: 1)

1> Use MSE (mean squared error) as the cost function

2> Add a placeholder for the true target
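To see why the choice of cost function matters, the derivatives of the two losses with respect to w can be written out. This is a short derivation rather than anything from the original post; the notation ($n$ samples, $\bar{x}$ for the mean of the inputs, $y_i$ for the targets, and the labels "signed" and "MSE") is introduced here for illustration.

```latex
\begin{align*}
L_{\text{signed}}(w) &= \frac{1}{n}\sum_{i=1}^{n} (w x_i - x_i) = (w - 1)\,\bar{x},
&
\frac{\partial L_{\text{signed}}}{\partial w} &= \bar{x}, \\[4pt]
L_{\text{MSE}}(w) &= \frac{1}{n}\sum_{i=1}^{n} (w x_i - y_i)^2,
&
\frac{\partial L_{\text{MSE}}}{\partial w} &= \frac{2}{n}\sum_{i=1}^{n} (w x_i - y_i)\,x_i .
\end{align*}
```

For the inputs in the question, $\bar{x} = 4.88 > 0$, so the gradient of the signed loss never vanishes and that loss has no finite minimizer in w. The MSE gradient is zero at $w = \sum_i x_i y_i / \sum_i x_i^2$, which equals 1 when $y_i = x_i$.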

import tensorflow as tf
tf.reset_default_graph()
sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, shape=[None])
# placeholder for true target
y = tf.placeholder(tf.float32, shape=[None])

with tf.variable_scope('weight', reuse=True):
    w = tf.Variable([0.5])

weights = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='weight')

y_pred = w*x
# we choose mse as cost function
loss = tf.reduce_mean((y_pred-y)**2)
train_step = tf.train.AdamOptimizer(1e-3).minimize(loss, var_list=weights)
sess.run(tf.global_variables_initializer())
for _ in range(10000):
    sess.run(train_step, feed_dict= {x:[2.0,3.5,4.6,7.8,6.5],
                                     y:[2.0,3.5,4.6,7.8,6.5]})

print(w.eval())

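Output: [1.]

In your code, the prediction w*x never actually takes effect: y is defined as the tensor w*x, but you always feed a constant array for y in feed_dict, and the fed values replace the computed ones. That is also why the loss is evaluated as zero.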
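The effect of the cost function can also be checked without TensorFlow. The following is a minimal sketch in plain Python, assuming ordinary gradient descent as a stand-in for the Adam optimizer used in the post; the helper names `signed_grad`, `mse_grad`, and `descend` are hypothetical and exist only for this illustration.

```python
# Inputs and identity targets taken from the question.
xs = [2.0, 3.5, 4.6, 7.8, 6.5]
ys = list(xs)


def signed_grad(w, xs):
    # d/dw mean(w*x - x) = mean(x); note that it does not depend on w.
    return sum(xs) / len(xs)


def mse_grad(w, xs, ys):
    # d/dw mean((w*x - y)**2) = 2 * mean((w*x - y) * x)
    return 2.0 * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def descend(grad_fn, w=0.5, lr=1e-3, steps=10000):
    # Plain gradient descent (assumption: not Adam, so the exact
    # numbers differ from what TensorFlow would report).
    for _ in range(steps):
        w -= lr * grad_fn(w)
    return w


w_signed = descend(lambda w: signed_grad(w, xs))
w_mse = descend(lambda w: mse_grad(w, xs, ys))

print("signed loss, final w:", w_signed)
print("MSE loss, final w:", w_mse)
```

With the signed loss the weight keeps decreasing for as long as training runs, because the gradient is a positive constant. With MSE the gradient shrinks as w approaches 1.0, so the weight settles there.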