I built a convolutional neural network model with TensorFlow by following the TensorFlow tutorial [1]. The model uses convolution filters filter1: [5,5,1,16] and filter2: [5,5,16,64], fully connected layers [7*7*64,1024] and [1024,10], and then softmax to convert the output to probabilities. When I ran this model it failed: the loss did not decrease, and every output was [0,0,1,0,0,0,0,0,0,0].
Then I reduced the number of filters and neurons, and it succeeded, with accuracy of about 97%.
Why can't I train successfully with the original number of filters and neurons?
Here is my failed model. (I used "mnist.csv".)
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28*28])
t = tf.placeholder(tf.float32, [None, 10])
def weight(shape):
    init = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init)

def bias(shape):
    init = tf.constant(0.1, shape=shape)
    return tf.Variable(init)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding="SAME")

def max_pool_22(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")
W_conv1 = weight([5,5,1,16])
b_conv1 = bias([16])
x_image = tf.reshape(x,[-1,28,28,1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_22(h_conv1)
print(h_pool1.shape)
W_conv2 = weight([5,5,16,64])
b_conv2 = bias([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1,W_conv2) + b_conv2)
h_pool2 = max_pool_22(h_conv2)
W_fc1 = weight([7*7*64,1024])
b_fc1 = bias([1024])
h_pool2_flat = tf.reshape(h_pool2,[-1,7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat,W_fc1) + b_fc1)
W_fc2 = weight([1024,10])
b_fc2 = bias([10])
prediction = tf.nn.softmax(tf.matmul(h_fc1,W_fc2) + b_fc2)
cross_entropy=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=t,logits=prediction))
train_step = tf.train.AdamOptimizer().minimize(cross_entropy)
correct_prediction =tf.equal(tf.argmax(prediction,1),tf.argmax(t,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for epoch in range(20):
    avg_loss = 0.
    avg_accuracy = 0.
    for i in range(1000):
        ind = np.random.choice(len(x_train), 50)
        x_train_batch = x_train[ind]
        t_train_batch = t_train[ind]
        _, loss, a = sess.run([train_step, cross_entropy, accuracy],
                              feed_dict={x: x_train_batch, t: t_train_batch})
        avg_loss += loss/1000
        avg_accuracy += a/1000
    if epoch % 1 == 0:
        print("Step:{0} Loss:{1} TrainAccuracy:{2}".format(epoch, avg_loss, avg_accuracy))
print("test_accuracy:{0}".format(accuracy.eval(feed_dict={x: x_test, t: t_test})))
[1]: https://www.tensorflow.org/get_started/mnist/pros
Answer 0 (score: 1)
You are calling softmax_cross_entropy_with_logits on the output of softmax. This applies softmax twice, leading to wrong results. softmax_cross_entropy_with_logits should be applied to the raw logits, before the softmax:
y = tf.matmul(h_fc1,W_fc2) + b_fc2
cross_entropy=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=t, logits=y))
prediction_probabilities = tf.nn.softmax(y)
prediction_class = tf.argmax(y, 1)
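(Note: softmax_cross_entropy_with_logits computes the softmax internally in a numerically stable way, which is why it must be fed the raw logits rather than already-normalized probabilities.)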
You only need the prediction_probabilities tensor above if you want a probability for each class. Otherwise, you can call argmax directly on y to get the predicted class.
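To see why the double softmax makes the loss stop decreasing: softmax outputs lie in [0, 1], so a second softmax receives inputs spanning less than one unit and flattens the distribution toward uniform, and the cross-entropy then barely changes as the weights move. A minimal NumPy sketch of this effect (the values and the softmax helper here are illustrative, not from the original post):

import numpy as np

def softmax(z):
    # standard softmax with max-subtraction for numerical stability
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, -1.0, 0.5])
once = softmax(logits)   # approx. [0.79, 0.04, 0.18] -- a confident distribution
twice = softmax(once)    # approx. [0.50, 0.24, 0.27] -- squashed toward uniform
print(once, twice)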