Question

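I'm having trouble building a deep ReLU network with TensorFlow on the MNIST dataset. It works well when I use the built-in tf.nn.softmax_cross_entropy_with_logits() as my loss, but computing the cross-entropy term manually does not seem to work.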

Here is what the network looks like:

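import tensorflow as tf

train_subset = 200
num_features = 784
num_labels = 10
num_units = 200

# Input placeholder for flattened MNIST images (assumed; not defined in the original post)
x = tf.placeholder(tf.float32, [None, num_features], name="x-input")

bias1 = tf.Variable(tf.constant(0.1, shape=[num_units]), name="bias1")
bias2 = tf.Variable(tf.constant(0.1, shape=[num_units]), name="bias2")
bias3 = tf.Variable(tf.constant(0.1, shape=[num_units]), name="bias3")
bias_out = tf.Variable(tf.constant(0.1, shape=[num_labels]), name="bias_out")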

weights1 = tf.Variable(tf.random_normal([num_features, num_units]), name="weights_layer1")
weights2 = tf.Variable(tf.random_normal([num_units, num_units]), name="weights_layer2")
weights3 = tf.Variable(tf.random_normal([num_units, num_units]), name="weights_layer3")
weights_out = tf.Variable(tf.random_normal([num_units, num_labels]), name="weights_out")

# The deep ReLU network
h_relu1 = tf.nn.relu(tf.add(tf.matmul(x, weights1), bias1))
h_relu2 = tf.nn.relu(tf.add(tf.matmul(h_relu1, weights2), bias2))
h_relu3 = tf.nn.relu(tf.add(tf.matmul(h_relu2, weights3), bias3))
logits = tf.matmul(h_relu3, weights_out) + bias_out

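In other words, this works fine, whereas replacing the loss with a manually computed cross-entropy does not: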

# Assume that y_ is fed a batch of output labels for MNIST
y_ = tf.placeholder(tf.float32, [None, num_labels], name='y-input')
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_))
optimizer = tf.train.AdamOptimizer(1e-3).minimize(cost)

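The manually computed version runs, but its accuracy gets stuck after the first few steps, whereas the version using softmax_cross_entropy_with_logits actually learns something. I have seen the manual setup used in the Deep MNIST example, which is why I am wondering what in my setup causes the optimization to stall.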

Answer 1

I think you are missing some steps needed to compute the cost accurately. Consider looking at the source code in nn_ops.py to see what softmax_cross_entropy_with_logits is doing.
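As a rough illustration of the kind of step that is easy to miss, here is a minimal sketch of a cross-entropy computed directly from logits with the log-sum-exp trick. It is written in plain Python rather than TensorFlow, it is not the code from nn_ops.py, and the function names `log_softmax` and `cross_entropy_from_logits` are hypothetical, chosen only for this example.

```python
import math

def log_softmax(logits):
    """Log of softmax, computed with the log-sum-exp trick."""
    m = max(logits)  # shift so the largest exponent is exp(0) = 1
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum_exp for z in logits]

def cross_entropy_from_logits(logits, labels):
    """Cross-entropy between one-hot (or soft) labels and softmax(logits)."""
    return -sum(t * lp for t, lp in zip(labels, log_softmax(logits)))

# Moderate logits: agrees with -log(softmax(logits)[0]).
print(cross_entropy_from_logits([2.0, 1.0, 0.1], [1.0, 0.0, 0.0]))

# Very large logits: the loss stays finite because the log-probabilities
# are computed directly and no probability is rounded to zero before the log.
print(cross_entropy_from_logits([1000.0, 0.0, -1000.0], [0.0, 1.0, 0.0]))
```

The point of the sketch is that the logarithm is never applied to a probability that has already underflowed to zero.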

Answer 2

Update

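In the end I was able to solve this by implementing the internals of the softmax_cross_entropy_with_logits() function myself; you can find the code on my GitHub. There are two versions, one for ordinary problems and one for multi-label problems.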

Previous answer:

Originally from the TensorFlow documentation:

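"(Note that in the source code, we don't use this formulation,

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

because it is numerically unstable. Instead, we apply tf.nn.softmax_cross_entropy_with_logits on the unnormalized logits (e.g., we call softmax_cross_entropy_with_logits on tf.matmul(x, W) + b), because this more numerically stable function internally computes the softmax activation. In your code, consider using tf.nn.(sparse_)softmax_cross_entropy_with_logits instead)"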

Source: https://www.tensorflow.org/versions/r0.11/tutorials/mnist/beginners/
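To make the instability concrete, the following plain-Python sketch evaluates the naive formula, the sum of label times log(softmax), on small and on very large logits. It is only an illustration under stated assumptions: `ieee_log` is a hypothetical helper that returns -inf for zero to mimic IEEE floating-point behaviour (Python's own `math.log(0.0)` raises instead), and whether logits this large actually occur in the question's network is an assumption, not something the post confirms.

```python
import math

def softmax(logits):
    """Softmax with the usual max-shift, so exp() itself never overflows."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ieee_log(p):
    # Hypothetical helper: return -inf for zero, as IEEE float arithmetic would,
    # instead of raising ValueError like math.log(0.0) does in Python.
    return math.log(p) if p > 0.0 else float("-inf")

def naive_cross_entropy(logits, labels):
    """Naive formula: -sum(label * log(softmax(logit)))."""
    return -sum(t * ieee_log(p) for t, p in zip(labels, softmax(logits)))

# Moderate logits: the naive formula behaves.
small = naive_cross_entropy([2.0, 1.0, 0.1], [1.0, 0.0, 0.0])

# Very large logits: the small probabilities underflow to exactly 0.0,
# log gives -inf, and 0 * -inf is NaN even though the prediction is correct.
large = naive_cross_entropy([1000.0, 0.0, -1000.0], [1.0, 0.0, 0.0])

print(small)
print(large)
```

Once a loss of this form stops being finite, gradient-based training can no longer make progress, which is consistent with accuracy getting stuck after the first few steps.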

Calculating cross-entropy manually vs. using softmax_cross_entropy_with_logits in TensorFlow
