I have a fairly simple NN with one hidden layer.
However, the weights don't seem to be updating. Or maybe they are, but the variable values just don't change?
Either way, my accuracy is stuck at 0.1 and it doesn't change no matter what learning rate or activation function I try. Not sure what's wrong. Any ideas?
I've posted the whole code below, properly formatted, so you can copy-paste it and run it directly on your local machine.
Answer 0 (score: 1)
The reason you keep getting 0.1 accuracy is mainly the dimension ordering of the input placeholder and of the weights that follow it. The learning rate is another factor: if it is very high, the gradients will oscillate and never settle into any minimum.
Tensorflow takes the number of instances (the batch) as the first index of a placeholder. So the code declaring the input x as
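The oscillation point can be illustrated with a minimal sketch (plain Python, not part of the answer's code): gradient descent on f(w) = w², whose gradient is 2w. The update w -= lr * 2w multiplies w by (1 - 2*lr) each step, so it converges for small lr and overshoots further every step once lr exceeds 1.

```python
def descend(lr, steps=50, w=1.0):
    """Run gradient descent on f(w) = w**2 starting from w = 1."""
    for _ in range(steps):
        w -= lr * 2 * w  # gradient of w**2 is 2*w
    return w

small = abs(descend(0.1))  # shrinks toward the minimum at 0
large = abs(descend(1.1))  # oscillates with growing amplitude
print(small, large)
```

With lr = 0.1 the iterate decays geometrically toward 0; with lr = 1.1 each step flips sign and grows, which is the "oscillates and never reaches a minimum" behaviour described above.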
x = tf.placeholder(tf.float32, [784, None],name='x')
should instead be declared as
x = tf.placeholder(tf.float32, [None, 784],name='x')
Accordingly, W1 should be declared as
W1 = tf.Variable(tf.truncated_normal([784, 25],stddev= 1.0/math.sqrt(784)),name='W')
and so on. Even the bias variables should be declared in this transposed sense. (That's just how tensorflow takes them :))
For example:
b1 = tf.Variable(tf.zeros([25]),name='b1')
b2 = tf.Variable(tf.zeros([25]),name='b2')
b3 = tf.Variable(tf.zeros([10]),name='b3')
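To see why these shapes line up, here is a quick shape check using numpy as a stand-in (the shapes are the ones from the answer; this is not TensorFlow code): with the batch dimension first, a batch of 100 flattened MNIST images flows through the first layer as (100, 784) @ (784, 25), and the (25,) bias broadcasts across the batch.

```python
import numpy as np

x  = np.zeros((100, 784))  # batch of 100 flattened 28x28 images
W1 = np.zeros((784, 25))   # input -> hidden layer weights
b1 = np.zeros(25)          # hidden layer bias, broadcast over the batch
h1 = x @ W1 + b1
print(h1.shape)  # -> (100, 25)
```

With the original [784, None] ordering the matmul shapes would not compose this way, which is exactly what breaks the network.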
Here is the complete corrected code for your reference. I reached an accuracy of 0.9262 with it :D

from tensorflow.examples.tutorials.mnist import input_data
import math
import numpy as np
import tensorflow as tf
# one hot option returns binarized labels.
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)
# model parameters
x = tf.placeholder(tf.float32, [None, 784],name='x')
# weights
W1 = tf.Variable(tf.truncated_normal([784, 25], stddev=1.0/math.sqrt(784)), name='W1')
W2 = tf.Variable(tf.truncated_normal([25, 25], stddev=1.0/math.sqrt(25)), name='W2')
W3 = tf.Variable(tf.truncated_normal([25, 10], stddev=1.0/math.sqrt(25)), name='W3')
# bias units
b1 = tf.Variable(tf.zeros([25]),name='b1')
b2 = tf.Variable(tf.zeros([25]),name='b2')
b3 = tf.Variable(tf.zeros([10]),name='b3')
# NN architecture
hidden1 = tf.nn.relu(tf.matmul(x, W1,name='hidden1')+b1, name='hidden1_out')
# hidden2 = tf.nn.sigmoid(tf.matmul(hidden1, W2, name='hidden2') + b2, name='hidden2_out')
y = tf.matmul(hidden1, W3,name='y') + b3
y_ = tf.placeholder(tf.float32, [None, 10],name='y_')
# Create the model
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy)
sess = tf.Session()
summary_writer = tf.summary.FileWriter('log_simple_graph', sess.graph)
init = tf.global_variables_initializer()
sess.run(init)
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # running a training op returns None, so there is no summary to record here
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
# Test trained model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))