My TensorFlow model fails to train when I increase the number of neurons or layers

Asked: 2017-04-08 07:56:42

Tags: python tensorflow neural-network deep-learning

I built a convolutional neural network model with TensorFlow by following the TensorFlow tutorial [1]. The model uses convolution filter 1: [5,5,1,16], filter 2: [5,5,16,32], fully connected layers [7*7*32, 1024] and [1024, 10], and finally a softmax to convert the output into probabilities. I ran this model and it failed: the loss did not decrease, and every output was [0,0,1,0,0,0,0,0,0,0].

Then I reduced the number of filters and neurons, and training succeeded, with accuracy around 97%.

Why does training fail when I build the model with this many filters and neurons?

Here is my failing model. (I used "mnist.csv". A sketch of the assumed data loading follows.)
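The question does not show how "mnist.csv" is read, so here is a minimal, hypothetical loader for the x_train, t_train, x_test, t_test arrays the code below uses; the column layout (label first, then 784 pixel values) is an assumption:

import numpy as np

# Hypothetical loader: assumes each row of mnist.csv is
# "label, pixel_0, ..., pixel_783" with pixel values in [0, 255].
data = np.loadtxt("mnist.csv", delimiter=",")
labels = data[:, 0].astype(np.int64)
images = (data[:, 1:] / 255.0).astype(np.float32)  # shape [N, 784]
one_hot = np.eye(10, dtype=np.float32)[labels]     # shape [N, 10]

# Arbitrary train/test split, for illustration only.
x_train, t_train = images[:-10000], one_hot[:-10000]
x_test, t_test = images[-10000:], one_hot[-10000:]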

import tensorflow as tf
import numpy as np

x = tf.placeholder(tf.float32, [None, 28*28])
t = tf.placeholder(tf.float32, [None, 10])
def weight(shape):
   init = tf.truncated_normal(shape, stddev=0.1)
   return tf.Variable(init)
def bias(shape):
   init = tf.constant(0.1, shape=shape)
   return tf.Variable(init)

def conv2d(x,W):
   return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding="SAME")
def max_pool_22(x):
   return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding="SAME")

W_conv1 = weight([5,5,1,16])
b_conv1 = bias([16])

x_image = tf.reshape(x,[-1,28,28,1])


h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

h_pool1 = max_pool_22(h_conv1)
print(h_pool1.shape)

W_conv2 = weight([5,5,16,64])
b_conv2 = bias([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1,W_conv2) + b_conv2)
h_pool2 = max_pool_22(h_conv2)
W_fc1 = weight([7*7*64,1024])
b_fc1 = bias([1024])

h_pool2_flat = tf.reshape(h_pool2,[-1,7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat,W_fc1) + b_fc1)

W_fc2 = weight([1024,10])
b_fc2 = bias([10])

prediction = tf.nn.softmax(tf.matmul(h_fc1,W_fc2) + b_fc2) 
cross_entropy=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=t,logits=prediction))
train_step = tf.train.AdamOptimizer().minimize(cross_entropy)

correct_prediction =tf.equal(tf.argmax(prediction,1),tf.argmax(t,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for epoch in range(20):
   avg_loss = 0.
   avg_accuracy = 0.
   for i in range(1000):
       ind = np.random.choice(len(x_train),50)
       x_train_batch = x_train[ind]
       t_train_batch = t_train[ind]
       _, loss, a = sess.run([train_step,cross_entropy, accuracy],feed_dict={x:x_train_batch,t:t_train_batch})
       avg_loss += loss/1000
       avg_accuracy += a/1000
   if epoch % 1 == 0:
      print("Step:{0} Loss:{1} TrainAccuracy:{2}".format(epoch,avg_loss,avg_accuracy))

print("test_accuracy:{0}".format(accuracy.eval(feed_dict={x: x_test, t: t_test})))

[1]: https://www.tensorflow.org/get_started/mnist/pros

1 Answer:

Answer 0 (score: 1)

You are calling softmax_cross_entropy_with_logits on the output of softmax. This applies softmax twice, leading to wrong results. softmax_cross_entropy_with_logits should be applied to the linear output of the last layer, before any softmax:
y = tf.matmul(h_fc1,W_fc2) + b_fc2
cross_entropy=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=t, logits=y))

prediction_probabilities = tf.nn.softmax(y)
prediction_class = tf.argmax(y, 1)

You only need the prediction_probabilities tensor above if you need the probability of each class. Otherwise, you can call argmax directly on y to get the predicted class.
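For completeness, here is how the relevant part of the question's model looks with this fix applied; a minimal sketch, with variable names taken from the question's code:

y = tf.matmul(h_fc1, W_fc2) + b_fc2  # linear output (logits); no softmax here

# softmax_cross_entropy_with_logits applies softmax internally,
# so it must receive the raw logits, not probabilities.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=t, logits=y))
train_step = tf.train.AdamOptimizer().minimize(cross_entropy)

# softmax is monotonic, so argmax over the logits gives the same
# predicted class as argmax over the probabilities.
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(t, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))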