Problem optimizing parameters in a basic multi-layer perceptron

Time: 2019-03-31 23:34:41

Tags: python tensorflow

I only recently got into TensorFlow, and I've run into some trouble going from a simple one-layer neural network to a multi-layer one. I've pasted the code from my attempt below; any help with why it isn't working would be greatly appreciated!

import tensorflow as tf
from tqdm import trange
from tensorflow.examples.tutorials.mnist import input_data

# Import data
mnist = input_data.read_data_sets("datasets/MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])
W0 = tf.Variable(tf.zeros([784, 500]))
b0 = tf.Variable(tf.zeros([500]))
y0 = tf.matmul(x, W0) + b0
relu0 = tf.nn.relu(y0)
W1 = tf.Variable(tf.zeros([500, 100]))
b1 = tf.Variable(tf.zeros([100]))
y1 = tf.matmul(relu0, W1) + b1
relu1 = tf.nn.relu(y1)
W2 = tf.Variable(tf.zeros([100, 10]))
b2 = tf.Variable(tf.zeros([10]))
y2 = tf.matmul(relu1, W2) + b2
y = y2


# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# Create a Session object, initialize all variables
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Train
for _ in trange(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)    
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Test trained model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Test accuracy: {0}'.format(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})))

sess.close()

PS: I know this code could be implemented much more easily with Keras, or even with pre-built TensorFlow layers, but I'm trying to get a more basic understanding of the math behind the library. Thanks!

1 Answer:

Answer 0 (score: 1)

There are two things to consider.

1) Change tf.Variable(tf.zeros([784, 500])) to tf.Variable(tf.random_normal([784, 500])), because it is better to initialize the weights randomly than to define them all as 0s from the start. If every weight starts at 0, every unit in a layer computes the same output and receives the same gradient, so the units never differentiate from one another and the model cannot learn. To get started, change each zeros to random_normal, as sketched below. There are better ways to initialize variables, but this will give you a good start.
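A minimal sketch of that change, assuming a small standard deviation (the stddev=0.1 here is my own choice, not part of the answer) so the initial activations stay in a reasonable range; the zero biases can stay as they are:

W0 = tf.Variable(tf.random_normal([784, 500], stddev=0.1))  # small random weights break the symmetry
b0 = tf.Variable(tf.zeros([500]))                           # zero biases are fine once the weights differ
W1 = tf.Variable(tf.random_normal([500, 100], stddev=0.1))
b1 = tf.Variable(tf.zeros([100]))
W2 = tf.Variable(tf.random_normal([100, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))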

2) Your learning rate is too high. Change this line:

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

to

train_step = tf.train.GradientDescentOptimizer(0.005).minimize(cross_entropy)
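With random weights the initial logits are no longer all zero, so a step size of 0.5 can easily overshoot and blow the loss up; 0.005 is a much more conservative step for plain gradient descent. One way to sanity-check the new rate is to watch the loss during training; a quick sketch of how the training loop could log it (the logging interval is an arbitrary choice of mine):

for step in trange(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Fetch the loss alongside the training op so it can be monitored
    _, loss = sess.run([train_step, cross_entropy], feed_dict={x: batch_xs, y_: batch_ys})
    if step % 100 == 0:
        print('step {0}: loss {1:.4f}'.format(step, loss))  # should decrease steadily, not oscillate or explode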