I'm running gradient descent with input X (24,1), output Y (6,1), and variables w and b, but the cost becomes NaN on the very first iteration, even with a learning rate of 1e-20. I also checked the gradient values for w, and they are all zero on the first iteration.
How does TensorFlow compute the gradients for GradientDescentOptimizer, and how can I track down this problem?
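For background, the gradient that gradient descent applies here can be reproduced by hand. A minimal NumPy sketch (shapes taken from the code below; all names are illustrative, not part of the original program) shows the two symptoms separately: if every ReLU pre-activation is non-positive, the gradient on w is exactly zero, and a NaN in the cost traces back to a NaN in the data rather than to the learning rate:

```python
import numpy as np

np.random.seed(0)
n_func, n_output = 24, 6
X = np.random.randn(1, n_func)
Y = np.random.randn(1, n_output)
w = np.random.randn(n_func, n_output)
b = np.random.randn(n_output)

# Forward pass: matmul + bias, ReLU, mean-squared-error cost
z = X @ w + b
pred = np.maximum(z, 0.0)
cost = np.mean((Y - pred) ** 2)

# Backward pass by hand
dpred = -2.0 * (Y - pred) / Y.size
dz = dpred * (z > 0)          # ReLU gate zeroes the gradient where z <= 0
dw = X.T @ dz                 # shape (24, 6), matches w

# Symptom 1: if z <= 0 everywhere (a "dead" ReLU layer), dw is all zeros.
z_dead = -np.abs(z) - 1.0
pred_dead = np.maximum(z_dead, 0.0)
dz_dead = (-2.0 * (Y - pred_dead) / Y.size) * (z_dead > 0)
dw_dead = X.T @ dz_dead
print(np.all(dw_dead == 0))   # True

# Symptom 2: a single NaN in the input makes the cost NaN immediately.
X_bad = X.copy()
X_bad[0, 0] = np.nan
cost_bad = np.mean((Y - np.maximum(X_bad @ w + b, 0.0)) ** 2)
print(np.isnan(cost_bad))     # True
```

All-zero gradients plus a NaN cost on the first step therefore point at the data being fed in, not at the optimizer's math.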
X = tf.placeholder(tf.float32, [1, n_func], name="X")
Y = tf.placeholder(tf.float32, [1, n_output], name="Y")
n_hl1 = n_output
# Hidden layer: weights (24, 6) and biases (6,)
hidden_layer_1 = {'w': tf.Variable(tf.random_normal([n_func, n_hl1]), name='h1w'),
                  'b': tf.Variable(tf.random_normal([n_hl1]), name='h1b')}
# (1, 24) x (24, 6) = (1, 6)
l1 = tf.add(tf.matmul(X, hidden_layer_1['w']), hidden_layer_1['b'])
l1 = tf.nn.relu(l1)
prediction = l1
# Mean-squared-error cost
cost = tf.reduce_mean(tf.square(Y - prediction), name="cost")
# Learning-rate schedule
global_step = tf.Variable(0, trainable=False)
starter_learning_rate = 1.0e-20
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
                                           epochs / 10, 0.8, staircase=True)
# GradientDescentOptimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost, global_step=global_step)

# epochs, n_test, h2 and y are defined elsewhere in the program
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(epochs):
        cost_value = 0
        for i in range(n_test):
            _, c = sess.run([optimizer, cost], {X: h2[i], Y: [y[:, i]]})
            cost_value += c
        print(cost_value)
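To narrow the problem down: at a learning rate of 1e-20 a float32 weight update underflows and changes nothing, so the NaN cannot come from the step size, which leaves the feed data. A small NumPy helper along these lines, assuming h2 and y are ordinary arrays (the helper name is hypothetical), can rule the inputs in or out:

```python
import numpy as np

def describe_feed(arr):
    """Summarize a feed array: NaN/Inf presence and magnitude."""
    a = np.asarray(arr, dtype=np.float64)
    return {
        "has_nan": bool(np.isnan(a).any()),
        "has_inf": bool(np.isinf(a).any()),
        "max_abs": float(np.abs(a).max()),
    }

# A step of size 1e-20 underflows next to a float32 weight:
w0 = np.float32(0.5)
w1 = w0 - np.float32(1e-20) * np.float32(1.0)   # one gradient step
print(w1 == w0)                                  # True: the weight cannot move

# A single bad value in the feed, by contrast, poisons the cost at once.
batch = np.random.randn(1, 24).astype(np.float32)
batch[0, 3] = np.inf
report = describe_feed(batch)
print(report["has_inf"])                         # True
```

Running every `h2[i]` and `y[:, i]` slice through such a check before the `sess.run` call would show whether the NaN originates in the data.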