Question

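I am following the multilayer perceptron example here: https://github.com/aymericdamien/TensorFlow-Examples and I am confused by the function tf.nn.softmax_cross_entropy_with_logits and how it relates to tf.nn.relu and reduce_sum. Suppose I declare a network: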

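import tensorflow as tf  # TensorFlow 1.x API (placeholders, tf.random_normal)

x    = tf.placeholder('float', [None, 24**2])
y    = tf.placeholder('float', [None, 10])
w1   = tf.Variable(tf.random_normal([24**2, 12]))
w2   = tf.Variable(tf.random_normal([12, 10]))
h    = tf.nn.relu(tf.matmul(x, w1))
yhat = tf.matmul(h, w2)  # raw logits, no softmax applied here

# cost function: the op applies softmax to the logits internally
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=yhat, labels=y))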

Shouldn't the above be the same as:

x    = tf.placeholder('float', [None, 24**2])
y    = tf.placeholder('float', [None, 10])
w1   = tf.Variable(tf.random_normal([24**2, 12]))
w2   = tf.Variable(tf.random_normal([12, 10]))
h    = tf.nn.relu(tf.matmul(x, w1))
yhat = tf.nn.softmax(tf.matmul(h, w2))  # explicit softmax: yhat holds probabilities

# cost function: hand-written cross entropy over the class axis
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(yhat), reduction_indices=1))

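But when I train with the first construction my accuracy is about 95%, while with the second it is about 1%. So clearly this is more than just "numerical instability", right?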

For the complete example, see: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/multilayer_perceptron.py

Answer 1

I did some quick research. I added the following at line 62 of the multilayer_perceptron.py file and printed it at line 87:

cost_v2 = tf.reduce_mean(-tf.reduce_sum(y*tf.log(tf.nn.softmax(pred)),1))

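On the first batch it came out as nan, because pred actually contains a lot of zeros after the softmax. My guess is that the built-in cross entropy ignores the zeros and just computes the sum, based on this: https://datascience.stackexchange.com/questions/9302/the-cross-entropy-error-function-in-neural-networks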
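
The nan can be reproduced without TensorFlow. The sketch below is plain Python, not TensorFlow's actual implementation; the helper names (softmax, ieee_log, naive_cross_entropy, stable_cross_entropy) and the sample logits are made up for illustration. It assumes, as described above, that some softmax outputs underflow to exactly zero, so that log gives -inf and 0 * -inf gives nan. It also computes the same cross entropy in a log-sum-exp form that stays finite, which is one plausible reason the fused op behaves better than the hand-written cost.

```python
import math

def softmax(logits):
    # Shift by the max so exp() cannot overflow; very small terms can still underflow to 0.0.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ieee_log(p):
    # Assumption: mimic IEEE float behaviour, where log(0) is -inf.
    # (Python's math.log(0.0) raises ValueError instead.)
    return math.log(p) if p > 0.0 else float("-inf")

def naive_cross_entropy(logits, labels):
    # -sum(y * log(softmax(z))), the same form as cost_v2 for a single example.
    probs = softmax(logits)
    return -sum(y * ieee_log(p) for y, p in zip(labels, probs))

def stable_cross_entropy(logits, labels):
    # Same quantity via log-softmax: log p_k = z_k - m - log(sum_j exp(z_j - m)).
    # The probabilities are never formed, so log(0) never occurs.
    m = max(logits)
    log_norm = math.log(sum(math.exp(z - m) for z in logits))
    return -sum(y * (z - m - log_norm) for y, z in zip(labels, logits))

labels = [1.0, 0.0, 0.0]                 # hypothetical one-hot label
small_logits = [2.0, 1.0, 0.1]           # hypothetical well-scaled logits
large_logits = [1000.0, 0.0, -1000.0]    # hypothetical logits from unscaled random weights

naive_small = naive_cross_entropy(small_logits, labels)
stable_small = stable_cross_entropy(small_logits, labels)
naive_large = naive_cross_entropy(large_logits, labels)
stable_large = stable_cross_entropy(large_logits, labels)

print("small logits:", naive_small, stable_small)
print("large logits:", naive_large, stable_large)
```

With the small logits the two functions agree to within rounding. With the large logits the naive version is nan, because two softmax outputs are exactly zero, while the log-sum-exp version stays finite.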
