I'm trying to create a simple linear classifier for MNIST data and I can't get my loss to decrease. What could be the problem? Here is my code:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

class LinearClassifier(object):
    def __init__(self):
        print("LinearClassifier loading MNIST")
        self._mnist = input_data.read_data_sets("mnist_data/", one_hot=True)
        self._buildGraph()

    def _buildGraph(self):
        self._tf_TrainX = tf.placeholder(tf.float32, [None, self._mnist.train.images.shape[1]])
        self._tf_TrainY = tf.placeholder(tf.float32, [None, self._mnist.train.labels.shape[1]])
        self._tf_Weights = tf.Variable(tf.random_normal([784, 10]), tf.float32)
        self._tf_Bias = tf.Variable(tf.zeros([10]), tf.float32)
        self._tf_Y = tf.nn.softmax(tf.matmul(self._tf_TrainX, self._tf_Weights) + self._tf_Bias)
        self._tf_Loss = tf.reduce_mean(-tf.reduce_sum(self._tf_TrainY * tf.log(self._tf_Y), reduction_indices=[1]))
        self._tf_TrainStep = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(self._tf_Loss)
        self._tf_CorrectGuess = tf.equal(tf.argmax(self._tf_Y, 1), tf.arg_max(self._tf_TrainY, 1))
        self._tf_Accuracy = tf.reduce_mean(tf.cast(self._tf_CorrectGuess, tf.float32))
        self._tf_Initializers = tf.global_variables_initializer()

    def train(self, epochs, batch_size):
        self._sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
        self._sess.run(self._tf_Initializers)
        for i in range(epochs):
            batchX, batchY = self._mnist.train.next_batch(batch_size)
            self._loss, _, self._accuracy = self._sess.run(
                [self._tf_Loss, self._tf_TrainStep, self._tf_Accuracy],
                feed_dict={self._tf_TrainX: batchX, self._tf_TrainY: batchY})
            print("Epoch: {0}, Loss: {1}, Accuracy: {2}".format(i, self._loss, self._accuracy))
When I run it via:
lc = LinearClassifier()
lc.train(1000, 100)
...I get something like this:
Epoch: 969, Loss: 8.19491195678711, Accuracy: 0.17999999225139618
Epoch: 970, Loss: 9.09421157836914, Accuracy: 0.1899999976158142
....
Epoch: 998, Loss: 7.865959167480469, Accuracy: 0.17000000178813934
Epoch: 999, Loss: 9.281349182128906, Accuracy: 0.10999999940395355
Why isn't tf.train.GradientDescentOptimizer training my weights and biases correctly?
Answer 0 (score: 3)
The main problem is that your learning rate (0.001) is too low. I changed it to 0.5, like they do in the mnist tensorflow tutorial, and with that change my accuracy and loss looked more like:
Epoch: 997, Loss: 0.6437355875968933, Accuracy: 0.8999999761581421
Epoch: 998, Loss: 0.6129786968231201, Accuracy: 0.8899999856948853
Epoch: 999, Loss: 0.6442205905914307, Accuracy: 0.8999999761581421
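For reference, the only line that changes for this is the optimizer step:

# same gradient descent op, just with the tutorial's learning rate of 0.5
self._tf_TrainStep = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(self._tf_Loss)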
Another slightly unusual thing is that your original code has this:
self._tf_Y = tf.nn.softmax(tf.matmul(self._tf_TrainX, self._tf_Weights) + self._tf_Bias)
self._tf_Loss = tf.reduce_mean(-tf.reduce_sum(self._tf_TrainY * tf.log(self._tf_Y), reduction_indices=[1]))
With the built-in softmax cross-entropy loss shown below, that would mean applying softmax twice. I did run it that way before changing it, and train accuracy was around 85%, so it does make a difference. Applying softmax twice is also harder to interpret theoretically.
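To see the squashing effect concretely, here is a small NumPy sketch (my illustration, not from the tutorial): applying softmax to an already-normalized probability vector pushes the values toward uniform, which weakens the training signal.

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
p1 = softmax(logits)   # -> roughly [0.66, 0.24, 0.10]
p2 = softmax(p1)       # softmax of probabilities -> roughly [0.45, 0.30, 0.26], much flatter
print(p1)
print(p2)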
Finally, they mention in the tutorial that computing the loss in the form above, -reduce_sum(label * log(y)), is numerically unstable, and that it is better to use the built-in op, which computes an analytically equivalent but numerically more stable softmax cross-entropy. With both changes applied, the affected lines look like:
self._tf_Y = tf.matmul(self._tf_TrainX, self._tf_Weights) + self._tf_Bias
self._tf_Loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=self._tf_TrainY, logits=self._tf_Y))
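Note that the accuracy part of the graph can stay as it is: softmax is monotonic, so taking argmax over the raw logits selects the same class as taking it over the softmax probabilities. For completeness, here is a sketch of _buildGraph with both fixes applied (same graph as yours, only the lines discussed above changed):

def _buildGraph(self):
    self._tf_TrainX = tf.placeholder(tf.float32, [None, self._mnist.train.images.shape[1]])
    self._tf_TrainY = tf.placeholder(tf.float32, [None, self._mnist.train.labels.shape[1]])
    self._tf_Weights = tf.Variable(tf.random_normal([784, 10]))
    self._tf_Bias = tf.Variable(tf.zeros([10]))
    # raw logits; softmax now happens inside the loss op
    self._tf_Y = tf.matmul(self._tf_TrainX, self._tf_Weights) + self._tf_Bias
    # numerically stable softmax cross-entropy on the logits
    self._tf_Loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=self._tf_TrainY, logits=self._tf_Y))
    # the tutorial's learning rate
    self._tf_TrainStep = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(self._tf_Loss)
    # argmax over logits == argmax over softmax probabilities
    self._tf_CorrectGuess = tf.equal(tf.argmax(self._tf_Y, 1), tf.argmax(self._tf_TrainY, 1))
    self._tf_Accuracy = tf.reduce_mean(tf.cast(self._tf_CorrectGuess, tf.float32))
    self._tf_Initializers = tf.global_variables_initializer()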