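I trained a CNN on MNIST, but when I check the model's accuracy on the test data after training, the reported accuracy keeps increasing even though nothing is being trained. Here is the code.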
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

BATCH_SIZE = 50
LR = 0.001  # learning rate
mnist = input_data.read_data_sets('./mnist', one_hot=True)  # images are already normalized to the range (0, 1)
test_x = mnist.test.images[:2000]
test_y = mnist.test.labels[:2000]

def new_cnn(imageinput, inputshape):
    weights = tf.Variable(tf.truncated_normal(inputshape, stddev=0.1), name='weights')
    # NOTE: biases is created but never added to the conv output (kept as in the original run)
    biases = tf.Variable(tf.constant(0.05, shape=[inputshape[3]]), name='biases')
    layer = tf.nn.conv2d(imageinput, weights, strides=[1, 1, 1, 1], padding='SAME')
    layer = tf.nn.relu(layer)
    return weights, layer

tf_x = tf.placeholder(tf.float32, [None, 28 * 28])
image = tf.reshape(tf_x, [-1, 28, 28, 1])  # (batch, height, width, channel)
tf_y = tf.placeholder(tf.int32, [None, 10])  # input y

# CNN
weights1, layer1 = new_cnn(image, [5, 5, 1, 32])
pool1 = tf.layers.max_pooling2d(
    layer1,
    pool_size=2,
    strides=2,
)  # -> (14, 14, 32)
weight2, layer2 = new_cnn(pool1, [5, 5, 32, 64])  # -> (14, 14, 64)
pool2 = tf.layers.max_pooling2d(layer2, 2, 2)  # -> (7, 7, 64)
flat = tf.reshape(pool2, [-1, 7 * 7 * 64])  # -> (7*7*64, )
hide = tf.layers.dense(flat, 1024, name='hide')  # hidden layer
output = tf.layers.dense(hide, 10, name='output')

loss = tf.losses.softmax_cross_entropy(onehot_labels=tf_y, logits=output)  # compute cost
# tf.metrics.accuracy returns (accuracy, update_op); [1] selects update_op
accuracy = tf.metrics.accuracy(labels=tf.argmax(tf_y, axis=1), predictions=tf.argmax(output, axis=1))[1]
train_op = tf.train.AdamOptimizer(LR).minimize(loss)

sess = tf.Session()
# the local variables are the ones created by tf.metrics.accuracy
init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
sess.run(init_op)  # initialize variables in the graph
saver = tf.train.Saver()

for step in range(101):
    b_x, b_y = mnist.train.next_batch(BATCH_SIZE)
    _, loss_ = sess.run([train_op, loss], {tf_x: b_x, tf_y: b_y})
    if step % 50 == 0:
        print(loss_)
        accuracy_, loss2 = sess.run([accuracy, loss], {tf_x: test_x, tf_y: test_y})
        print('Step:', step, '| test accuracy: %f' % accuracy_)
To simplify the problem, I used only 100 training iterations. The final accuracy on the test set is about 0.655000.
But when I run the following code:
for i in range(5):
    accuracy2 = sess.run(accuracy, {tf_x: test_x, tf_y: test_y})
    print(sess.run(weight2[1, :, 0, 0]))  # shows that the model parameters are not being updated
    print(accuracy2)
Output:
[-0.06928255 -0.13498515 0.01266837 0.05656774 0.09438231]
0.725875
[-0.06928255 -0.13498515 0.01266837 0.05656774 0.09438231]
0.7684
[-0.06928255 -0.13498515 0.01266837 0.05656774 0.09438231]
0.79675
[-0.06928255 -0.13498515 0.01266837 0.05656774 0.09438231]
0.817
[-0.06928255 -0.13498515 0.01266837 0.05656774 0.09438231]
0.832187
This confuses me. Can anyone tell me what is wrong? Thanks for your patience!
Answer 0 (score: 0)
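tf.metrics.accuracy is not as trivial as you might think. Take a look at its documentation:
The accuracy function creates two local variables, total and count, that are used to compute the frequency with which predictions matches labels. This frequency is ultimately returned as accuracy: an idempotent operation that simply divides total by count.
For estimation of the metric over a stream of data, the function creates an update_op operation that updates these variables and returns the accuracy. Internally, an is_correct operation computes a Tensor with elements 1.0 where the corresponding elements of predictions and labels match and 0.0 otherwise. Then update_op increments total with the reduced sum of the product of weights and is_correct, and it increments count with the reduced sum of weights.
...
Returns:
- accuracy: A Tensor representing the accuracy, the value of total divided by count.
- update_op: An operation that increments the total and count variables appropriately and whose value matches accuracy.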
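The total/count bookkeeping described in the documentation can be sketched in plain Python. This is only an illustration of the mechanism, not TensorFlow's implementation: the class StreamingAccuracy and its methods update and value are hypothetical names, and weights are assumed to be 1 for every example.

```python
class StreamingAccuracy:
    """Hypothetical pure-Python stand-in for the total/count bookkeeping."""

    def __init__(self):
        self.total = 0.0  # running number of correct predictions
        self.count = 0.0  # running number of examples seen

    def update(self, predictions, labels):
        # Analogue of update_op: accumulate into total/count, then
        # return the running accuracy over everything seen so far.
        is_correct = [1.0 if p == l else 0.0 for p, l in zip(predictions, labels)]
        self.total += sum(is_correct)
        self.count += len(is_correct)
        return self.value()

    def value(self):
        # Analogue of the accuracy tensor: idempotent, just total / count.
        return self.total / self.count if self.count else 0.0


metric = StreamingAccuracy()
first = metric.update([1, 2, 3, 4], [1, 2, 0, 0])   # 2 of 4 correct
second = metric.update([1, 2, 3, 4], [1, 2, 3, 4])  # 4 of 4 correct
print(first, second, metric.value())
```

The second call reports 6 correct out of 8 rather than 4 out of 4, because the first batch is still part of the running totals. Calling value() any number of times leaves the totals unchanged.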
Note that it returns a tuple, and you are taking the second item, i.e. update_op. Consecutive invocations of update_op are treated as a stream of data, which is not what you intend (each evaluation during training affects every later evaluation). In fact, this running metric is pretty counter-intuitive.
The solution is to use a simple accuracy calculation. Change this line to:
accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(tf_y, axis=1), tf.argmax(output, axis=1)), tf.float32))
and you will get a stable accuracy calculation.
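As a sanity check, the numbers printed in the question are consistent with a running average over repeated evaluations of the same 2000 test images. The sketch below rests on two assumptions that are inferred rather than reported: the metric had been run 3 times during training (steps 0, 50 and 100), and the trained model's true accuracy on test_x is about 0.9385.

```python
per_run = 0.9385   # assumed true accuracy of the trained model (inferred, not reported)
running = 0.655    # value reported after the evaluations inside the training loop
evaluations = 3    # assumed: evaluations at steps 0, 50 and 100

history = []
for _ in range(5):
    # Each batch has the same size, so the running accuracy is a plain mean
    # over all evaluations performed so far.
    running = (running * evaluations + per_run) / (evaluations + 1)
    evaluations += 1
    history.append(running)

for value in history:
    print('%f' % value)
```

Under these assumptions the five values agree with the question's output (0.725875, 0.7684, 0.79675, 0.817, 0.832187) up to rounding. It also suggests that the 0.655 figure was itself an average that included the untrained model at step 0, so it understated the accuracy at step 100.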