I created a perceptron (i.e. a neural network with fully connected layers) with one hidden layer (with ReLU activation) in TensorFlow, and ran it on the MNIST data successfully, getting over 90% accuracy. But when I add a second hidden layer, I get very low accuracy (10%), even after many mini-batches of stochastic gradient descent. Any ideas why this happens? If it helps, I can add my Python code to this post.
Here is my graph code (it uses the starter code from the Udacity course, but with additional layers added). Note that some aspects have been commented out for simplicity, but the symptom is the same even with this simpler version (accuracy remains below 10% even after many iterations):
import tensorflow as tf

batch_size = 128
hidden_size = 256
train_subset = 10000

graph = tf.Graph()
with graph.as_default():
    # Input data. For the training data, we use a placeholder that will be fed
    # at run time with a training minibatch.
    # (image_size, num_labels and the dataset/label arrays are defined earlier
    # in the Udacity notebook this is based on.)
    tf_train_dataset = tf.placeholder(tf.float32,
                                      shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    #tf_train_dataset = tf.constant(train_dataset[:train_subset, :])
    #tf_train_labels = tf.constant(train_labels[:train_subset])
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)

    # Variables.
    weightsToHidden1 = tf.Variable(
        tf.truncated_normal([image_size * image_size, hidden_size]))
    biasesToHidden1 = tf.Variable(tf.zeros([hidden_size]))

    weightsToHidden2 = tf.Variable(
        tf.truncated_normal([hidden_size, hidden_size]))
    biasesToHidden2 = tf.Variable(tf.zeros([hidden_size]))

    weightsToOutput = tf.Variable(
        tf.truncated_normal([hidden_size, num_labels]))
    biasesToOutput = tf.Variable(tf.zeros([num_labels]))

    # Training computation: first hidden layer (ReLU).
    logitsToHidden1 = tf.nn.relu(tf.matmul(tf_train_dataset, weightsToHidden1)
                                 + biasesToHidden1)
    validLogitsToHidden1 = tf.nn.relu(tf.matmul(tf_valid_dataset, weightsToHidden1)
                                      + biasesToHidden1)
    testLogitsToHidden1 = tf.nn.relu(tf.matmul(tf_test_dataset, weightsToHidden1)
                                     + biasesToHidden1)

    # Second hidden layer (ReLU) -- the newly added layer.
    logitsToHidden2 = tf.nn.relu(tf.matmul(logitsToHidden1, weightsToHidden2)
                                 + biasesToHidden2)
    validLogitsToHidden2 = tf.nn.relu(tf.matmul(validLogitsToHidden1, weightsToHidden2)
                                      + biasesToHidden2)
    testLogitsToHidden2 = tf.nn.relu(tf.matmul(testLogitsToHidden1, weightsToHidden2)
                                     + biasesToHidden2)

    # Output layer (raw logits, no activation).
    logitsToOutput = tf.matmul(logitsToHidden2, weightsToOutput) + biasesToOutput
    validLogitsToOutput = tf.matmul(validLogitsToHidden2, weightsToOutput) + biasesToOutput
    testLogitsToOutput = tf.matmul(testLogitsToHidden2, weightsToOutput) + biasesToOutput

    loss = (tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logitsToOutput, tf_train_labels))) #+
        # tf.nn.l2_loss(weightsToHidden1) * 0.002 +
        #tf.nn.l2_loss(weightsToHidden2) * 0.002 +
        #tf.nn.l2_loss(weightsToOutput) * 0.002)

    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logitsToOutput)
    valid_prediction = tf.nn.softmax(validLogitsToOutput)
    test_prediction = tf.nn.softmax(testLogitsToOutput)
Answer 0 (score: 0)
Change the learning rate to 0.01 or an even smaller value. It helps, but in my case the accuracy was still worse than that of the two-layer perceptron.
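For reference, a minimal sketch of the suggested change against the graph above. Only the learning rate passed to GradientDescentOptimizer differs from the question's code; 0.01 is the starting point suggested in this answer, not a tuned constant.

    # Same graph as in the question; only the optimizer line changes.
    # 0.01 is the suggested starting point; even smaller values may be needed.
    optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)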