Question

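I am working through this deep learning tutorial, in which the author builds a simple neural network with one hidden layer. I did the same thing and it worked fine (94% accuracy). I have now added one more layer, and the accuracy has dropped to 10%. I don't know why. My code is below: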

```python
# TensorFlow 1.x API
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

sess = tf.InteractiveSession()
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Flattened 28x28 images and one-hot labels
input_images = tf.placeholder(tf.float32, shape=[None, 784])
target_labels = tf.placeholder(tf.float32, shape=[None, 10])

hidden_nodes1 = 512
hidden_nodes2 = 256

# Input -> first hidden layer
input_weights = tf.Variable(tf.truncated_normal([784, hidden_nodes1]))
input_biases = tf.Variable(tf.zeros([hidden_nodes1]))

# First hidden layer -> second hidden layer
hidden_weights1 = tf.Variable(tf.truncated_normal([hidden_nodes1, hidden_nodes2]))
hidden_biases1 = tf.Variable(tf.zeros([hidden_nodes2]))

# Second hidden layer -> output
hidden_weights2 = tf.Variable(tf.truncated_normal([hidden_nodes2, 10]))
hidden_biases2 = tf.Variable(tf.zeros([10]))

input_layer = tf.matmul(input_images, input_weights)
hidden_layer1 = tf.nn.relu(input_layer + input_biases)
hidden_layer2 = tf.nn.relu(tf.matmul(hidden_layer1, hidden_weights1) + hidden_biases1)

# Unnormalized class scores (logits)
digits_weights = tf.matmul(hidden_layer2, hidden_weights2) + hidden_biases2

loss_funtion = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=digits_weights, labels=target_labels))
optimizer = tf.train.GradientDescentOptimizer(0.2).minimize(loss_funtion)

correct_prediction = tf.equal(tf.argmax(digits_weights, 1), tf.argmax(target_labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

tf.global_variables_initializer().run()
for x in range(2000):
    batch = mnist.train.next_batch(100)
    optimizer.run(feed_dict={input_images: batch[0], target_labels: batch[1]})
    # Report test accuracy every 100 steps
    if (x + 1) % 100 == 0:
        print("Training Epoc" + str(x + 1))
        print("Accuracy" + str(accuracy.eval(feed_dict={input_images: mnist.test.images, target_labels: mnist.test.labels})))
```

Answer 1

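Your code is actually fine. However, by adding a new hidden layer with 256 nodes, you greatly increase the number of learnable parameters. Essentially, your model architecture has become too large. My suggestion is to reduce the number of nodes from 512 and 256 to 128, or 256 at most. Next, the current learning rate is too high, so training may fail to converge to a minimum (and may even diverge); use a much lower learning rate. I would change it to 0.01 or smaller. Another thing you can try is `AdamOptimizer` instead of `GradientDescentOptimizer`. Try these and the code should work.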
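
To put rough numbers on the point about learnable parameters, here is a minimal sketch in plain Python that counts the weights and biases of a fully connected network. The helper name `count_parameters` is made up for illustration, and using 128 units in both hidden layers is only one reading of the suggestion above.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    (Hypothetical helper, not part of TensorFlow.)
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weight matrix between two layers
        total += fan_out           # bias vector of the receiving layer
    return total


# Architecture from the question: 784 -> 512 -> 256 -> 10
question_net = count_parameters([784, 512, 256, 10])

# Assumed smaller variant: 784 -> 128 -> 128 -> 10
smaller_net = count_parameters([784, 128, 128, 10])

print("question network:", question_net)  # 535818
print("smaller network:", smaller_net)    # 118282
```

Under that assumption the smaller network has roughly a fifth of the parameters of the one in the question.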
