I am trying to write the code for a classification problem (image classification) in Python. Using the gradient descent optimizer from the TensorFlow library, I built a 5-layer NN with a defined number of neurons per layer. The first "complete" version of the code reached high accuracy, but since it was clearly overfitting, I decided to add a dropout regularizer. At the end of the first run, after training the model for 1000 iterations, everything looked fine, with Train Accuracy: 1.0 and Test Accuracy: 0.9606. A few seconds later I re-ran the same code and, as you can see from the following two images, something went wrong: "Prediction accuracy through dropout regularization - FAIL", "Cost function through dropout regularization - FAIL". The run stopped before the last defined iteration without printing any warning or output! How is this possible when running the same code? Has anyone run into this problem before? Is it a problem with the cross-entropy computation, or how can the NN accuracy instantly decay from such a high level to zero?
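One plausible explanation (an assumption on my part, not confirmed in the post) is numerical, not random: computing the cross entropy as -sum(Y_ * log(Y)) directly on the softmax output produces NaN as soon as the network assigns probability 0.0 to the true class, and from that point on every gradient and accuracy value is garbage. A minimal pure-Python sketch of the failure mode (the function name and the eps value are illustrative):

```python
import math

def cross_entropy(probs, labels, eps=0.0):
    # naive cross entropy: -sum(y * log(p)); eps clips p away from zero
    return -sum(y * math.log(max(p, eps))
                for p, y in zip(probs, labels) if y > 0)

labels = [0.0, 1.0, 0.0]
print(cross_entropy([0.1, 0.8, 0.1], labels))  # finite loss, about 0.223
# Once the network saturates, the true class can receive probability 0.0;
# math.log(0.0) then raises an error (or yields -inf/NaN in tensor math),
# which is why the clipped version is needed:
print(cross_entropy([1.0, 0.0, 0.0], labels, eps=1e-10))  # large but finite
```

In TensorFlow the same idea is usually expressed either by clipping (`tf.clip_by_value`) or by computing the loss from the logits with `tf.nn.softmax_cross_entropy_with_logits`, which is numerically stable by construction.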
Answer 0: (score: 0)
Here is the code. It is based on the MNIST dataset for the digit-classification problem:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# The original post did not show these definitions; the layer sizes and the
# learning rate below are typical placeholder values, not the author's.
learning_rate = 0.003
K, L, M, N = 200, 100, 60, 30  # neurons per hidden layer
X = tf.placeholder(tf.float32, [None, 784])
pkeep = tf.placeholder(tf.float32)  # probability of keeping a unit (dropout)
W1 = tf.Variable(tf.truncated_normal([784, K], stddev=0.1)); b1 = tf.Variable(tf.zeros([K]))
W2 = tf.Variable(tf.truncated_normal([K, L], stddev=0.1)); b2 = tf.Variable(tf.zeros([L]))
W3 = tf.Variable(tf.truncated_normal([L, M], stddev=0.1)); b3 = tf.Variable(tf.zeros([M]))
W4 = tf.Variable(tf.truncated_normal([M, N], stddev=0.1)); b4 = tf.Variable(tf.zeros([N]))
W5 = tf.Variable(tf.truncated_normal([N, 10], stddev=0.1)); b5 = tf.Variable(tf.zeros([10]))
Xflatten=tf.reshape(X, [-1, 784])
Y1f=tf.nn.relu(tf.matmul(Xflatten,W1) + b1)
Y1=tf.nn.dropout(Y1f,pkeep)
Y2f=tf.nn.relu(tf.matmul(Y1,W2) + b2)
Y2=tf.nn.dropout(Y2f,pkeep)
Y3f=tf.nn.relu(tf.matmul(Y2,W3) + b3)
Y3=tf.nn.dropout(Y3f,pkeep)
Y4f=tf.nn.relu(tf.matmul(Y3,W4) + b4)
Y4=tf.nn.dropout(Y4f,pkeep)
Y=tf.nn.softmax(tf.matmul(Y4,W5) + b5)
Y_=tf.placeholder(tf.float32, [None, 10]) #Y_ stands for the labels representing the digits; "one-hot" encoded
cross_entropy= -tf.reduce_sum(Y_ * tf.log(tf.clip_by_value(Y, 1e-10, 1.0))) #clipping matters: log(0) yields NaN and silently destroys training
is_correct=tf.equal(tf.argmax(Y,1),tf.argmax(Y_,1)) #tf.argmax() allows to make the "one-hot" decoding
accuracy=tf.reduce_mean(tf.cast(is_correct, tf.float32))
optimizer=tf.train.GradientDescentOptimizer(learning_rate)
train_step=optimizer.minimize(cross_entropy)
init=tf.global_variables_initializer()
sess=tf.Session()
sess.run(init)
iterations_num=1000
xaxis=np.arange(iterations_num)
minibatch_size = 100 #value not shown in the original post
num_minibatches = int(50000 / minibatch_size) #computed but never used below
cost_train=[]
accuracy_train=[]
cost_test=[]
accuracy_test=[]
for i in range(iterations_num):
    #load batch of images and correct answers
    batch_X, batch_Y = mnist.train.next_batch(100)
    #pkeep must be fed explicitly: keep 75% of the units while training
    train_data={X: batch_X, Y_: batch_Y, pkeep: 0.75}
    #train
    sess.run(train_step, feed_dict=train_data)
    #success?
    a_train,c_train=sess.run([accuracy, cross_entropy], feed_dict=train_data)
    cost_train.append(c_train)
    accuracy_train.append(a_train)
    #success on test data? (pkeep=1.0: dropout must be disabled at evaluation)
    test_data={X: mnist.test.images, Y_: mnist.test.labels, pkeep: 1.0}
    a_test,c_test=sess.run([accuracy, cross_entropy], feed_dict=test_data)
    cost_test.append(c_test)
    accuracy_test.append(a_test)
plt.plot(xaxis,cost_train,'b',xaxis,cost_test,'r')
plt.ylabel('cost with dropout regularization')
plt.xlabel('iterations')
plt.title("Learning rate =" + str(learning_rate))
plt.show()
plt.plot(xaxis,accuracy_train,'b',xaxis,accuracy_test,'r')
plt.ylabel('accuracy with dropout regularization')
plt.xlabel('iterations')
plt.title("Learning rate =" + str(learning_rate))
plt.show()
print ("Train Accuracy:" + str(a_train))
print ("Test Accuracy:" + str(a_test))
sess.close()
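A related point worth checking in the code above is how `pkeep` is used. `tf.nn.dropout` implements "inverted" dropout: surviving units are scaled by 1/pkeep at training time so the expected activation stays unchanged, and at evaluation time pkeep must be 1.0, which makes dropout the identity. A pure-Python sketch of that behaviour (function name is mine, not from TensorFlow):

```python
import random

def dropout(activations, pkeep):
    # inverted dropout as tf.nn.dropout implements it: each unit survives
    # with probability pkeep and is scaled by 1/pkeep, so the expected
    # value of every activation is unchanged
    return [a / pkeep if random.random() < pkeep else 0.0 for a in activations]

random.seed(0)
acts = [1.0] * 10000
kept = dropout(acts, 0.75)
print(sum(kept) / len(kept))  # close to 1.0, as expected

# at evaluation time pkeep=1.0 makes dropout a no-op
assert dropout(acts, 1.0) == acts
```

If the test-set evaluation is accidentally run with the training pkeep, the reported test accuracy is noisy and pessimistic, which compounds the confusion when a run goes wrong.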