2-layer NN weights not updating

Date: 2016-12-09 06:10:59

Tags: python-2.7 neural-network tensorflow

I have a fairly simple NN with one hidden layer.

However, the weights don't seem to update. Or perhaps they do, but the variable values don't change?

Either way, my accuracy is 0.1 and it stays there no matter how I change the learning rate or the activation function. Not sure what's wrong. Any ideas?

I've posted the entire code, properly formatted, so you can copy-paste it and run it on your local machine.


1 Answer:

Answer 0 (score: 1)

The reason you keep getting 0.1 accuracy is mainly the dimension ordering of the input placeholder, and of the weights that follow from it. The learning rate is another factor: if it is very high, the gradients will oscillate and never settle into a minimum.
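
If you want to see the learning-rate effect in isolation, you can sweep a few candidate rates and compare test accuracy. This is only a sketch: build_model(lr) is a hypothetical helper, not part of the original code, that would rebuild the graph below for a given rate and return its placeholders, train op, and accuracy node.

# Minimal learning-rate sweep (a sketch; build_model is a hypothetical helper)
for lr in [1.0, 0.1, 0.01]:
    tf.reset_default_graph()                      # fresh graph for each rate
    x, y_, train_step, accuracy = build_model(lr)
    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        for _ in range(200):                      # short training run
            batch_xs, batch_ys = mnist.train.next_batch(100)
            sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
        print('lr=%g -> test accuracy %.4f' % (lr, sess.run(
            accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})))

A very large rate (e.g. 1.0) will typically bounce around or diverge, while something like 0.1 descends steadily on a model of this size.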

TensorFlow expects the number of instances (the batch) as the first dimension of a placeholder. So the code declaring the input x

x = tf.placeholder(tf.float32, [784, None],name='x')

should instead be declared as

x = tf.placeholder(tf.float32, [None, 784],name='x')

Accordingly, W1 should be declared as

W1 = tf.Variable(tf.truncated_normal([784, 25], stddev=1.0/math.sqrt(784)), name='W1')

and so on. Even the bias variables should be declared in this transposed sense. (That's simply how TensorFlow accepts them :))

For example:

b1 = tf.Variable(tf.zeros([25]),name='b1') 
b2 = tf.Variable(tf.zeros([25]),name='b2') 
b3 = tf.Variable(tf.zeros([10]),name='b3')
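
With those declarations in place, a quick shape check makes the batch-first convention visible. A minimal sketch, assuming x, W1, and b1 are defined exactly as shown above:

# Sanity-check tensor shapes: the batch dimension ('?') comes first on x,
# and each bias is a plain vector that broadcasts across the batch.
print(x.get_shape())    # (?, 784)
print(W1.get_shape())   # (784, 25)
print(b1.get_shape())   # (25,)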

The full corrected code is below for your reference; with it I reached an accuracy of 0.9262 :D

from tensorflow.examples.tutorials.mnist import input_data
import math
import numpy as np
import tensorflow as tf

# one hot option returns binarized labels. 
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)   
# model parameters 
x = tf.placeholder(tf.float32, [None, 784],name='x')
# weights, scaled by 1/sqrt(fan_in) so initial activations stay small
W1 = tf.Variable(tf.truncated_normal([784, 25], stddev=1.0/math.sqrt(784)), name='W1')
W2 = tf.Variable(tf.truncated_normal([25, 25], stddev=1.0/math.sqrt(25)), name='W2')
W3 = tf.Variable(tf.truncated_normal([25, 10], stddev=1.0/math.sqrt(25)), name='W3')

# bias units 
b1 = tf.Variable(tf.zeros([25]),name='b1') 
b2 = tf.Variable(tf.zeros([25]),name='b2') 
b3 = tf.Variable(tf.zeros([10]),name='b3')

# NN architecture: one ReLU hidden layer feeding a linear output layer
hidden1 = tf.nn.relu(tf.matmul(x, W1, name='hidden1') + b1, name='hidden1_out')

# optional second hidden layer (W2/b2 are unused while this stays commented out)
# hidden2 = tf.nn.sigmoid(tf.matmul(hidden1, W2, name='hidden2') + b2, name='hidden2_out')

y = tf.matmul(hidden1, W3, name='y') + b3

y_ = tf.placeholder(tf.float32, [None, 10],name='y_')

# loss and training step
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy)

sess = tf.Session()   
summary_writer = tf.train.SummaryWriter('log_simple_graph', sess.graph)   
init = tf.initialize_all_variables()   
sess.run(init)

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # train_step is an op, so sess.run returns None; there is no summary to log here
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Test trained model 
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)) 
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(sess.run(accuracy, feed_dict={x: mnist.test.images,  y_: mnist.test.labels}))
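
Note that this uses the 2016-era TensorFlow 0.x API. On TensorFlow 1.x the deprecated calls have direct replacements; a quick sketch of the equivalents (not part of the original answer):

# TensorFlow 1.x replacements for the deprecated 0.x calls above:
summary_writer = tf.summary.FileWriter('log_simple_graph', sess.graph)  # was tf.train.SummaryWriter
init = tf.global_variables_initializer()                                # was tf.initialize_all_variables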