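I am new to machine learning. I started with the simplest example: classifying MNIST handwritten digit images with softmax and gradient descent. By referring to other examples, I came up with my own logistic regression below: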
import tensorflow as tf
import numpy as np
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = np.float32(x_train / 255.0)
x_test = np.float32(x_test / 255.0)
X = tf.placeholder(tf.float32, [None, 28, 28])
Y = tf.placeholder(tf.uint8, [100])
XX = tf.reshape(X, [-1, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
def err(x, y):
    predictions = tf.matmul(x, W) + b
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=tf.reshape(y, [-1, 1]), logits=predictions))
    # value = tf.reduce_mean(y * tf.log(predictions))
    # loss = -tf.reduce_mean(tf.one_hot(y, 10) * tf.log(predictions)) * 100.
    return loss
# cost = err(np.reshape(x_train[:100], (-1, 784)), y_train[:100])
cost = err(tf.reshape(X, (-1, 784)), Y)
optimizer = tf.train.GradientDescentOptimizer(0.005).minimize(cost)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# temp = sess.run(tf.matmul(XX, W) + b, feed_dict={X: x_train[:100]})
temp = sess.run(cost, feed_dict={X: x_train[:100], Y: y_train[:100]})
print(temp)
# print(temp.dtype)
# print(type(temp))
for i in range(100):
    sess.run(optimizer, feed_dict={X: x_train[i * 100: 100 * (i + 1)], Y: y_train[i * 100: 100 * (i + 1)]})
    # sess.run(optimizer, feed_dict={X: x_train[: 100], Y: y_train[:100]})
temp = sess.run(cost, feed_dict={X: x_train[:100], Y: y_train[:100]})
print(temp)
sess.close()
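I tried running the optimizer for some iterations, feeding it the training images and labels. As I understand it, the variables `W` and `b` should be updated while the optimizer runs, so the model should produce different results before and after training. But with this code, the cost printed before and after running the optimizer is the same. What mistake could cause this?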
Answer 0 (score: 1)
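You are initializing the weight matrix W with zeros, so at every weight update all parameters receive the same gradient value. For weight initialization, use tf.truncated_normal(), tf.random_normal(), tf.contrib.layers.xavier_initializer(), or something else, but not zeros.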
This is a similar question.
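To see how the starting weights affect the first update, here is a rough NumPy sketch rather than the original TensorFlow graph. It rests on a few assumptions: the batch is random stand-in data instead of MNIST, the helper names `softmax` and `grad_W` are made up for illustration, and the labels reshaped to `[-1, 1]` are assumed to apply the same value to all 10 classes. Under those assumptions it compares the gradient with respect to W for a zero start and a random start.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def grad_W(x, labels, W, b):
    """Gradient of the mean softmax cross-entropy with respect to W.

    labels has shape [batch, classes]; its rows need not sum to 1.
    """
    p = softmax(x @ W + b)
    # d(loss_i)/d(logits_i) = p_i * sum(labels_i) - labels_i
    dlogits = p * labels.sum(axis=1, keepdims=True) - labels
    return x.T @ dlogits / x.shape[0]

rng = np.random.default_rng(0)
x = rng.random((100, 784))          # stand-in for one batch of flattened images
y = rng.integers(0, 10, size=100)   # stand-in for the digit labels
b = np.zeros(10)

# Assumption: a [-1, 1] label column is applied to every class.
broadcast_labels = np.repeat(y.reshape(-1, 1), 10, axis=1).astype(np.float64)
# One-hot labels, included only for comparison.
one_hot_labels = np.eye(10)[y]

W_zero = np.zeros((784, 10))
W_rand = rng.normal(0.0, 0.1, size=(784, 10))   # non-zero start, as suggested above

g_zero_broadcast = grad_W(x, broadcast_labels, W_zero, b)
g_rand_broadcast = grad_W(x, broadcast_labels, W_rand, b)
g_zero_one_hot = grad_W(x, one_hot_labels, W_zero, b)

print(np.abs(g_zero_broadcast).max())
print(np.abs(g_rand_broadcast).max())
print(np.abs(g_zero_one_hot).max())
```

In this sketch, the zero start combined with the broadcast labels gives a gradient that vanishes up to floating-point rounding, so a gradient descent step leaves W unchanged and the cost stays the same. The random start gives a non-zero gradient. The one-hot comparison suggests the label format also plays a role, so it may be worth checking the labels passed to the loss as well as the initializer.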