Question

I'm learning TensorFlow from the following example: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/2_BasicModels/linear_regression.ipynb

I have a question about the code below:

X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
pred = tf.add(tf.mul(X, W), b)

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)

The input to tf.reduce_sum is tf.pow(pred-Y, 2), which looks like a scalar (or am I wrong?). So why would we want to call reduce_sum on a scalar? What am I missing here? Thanks!

Answer 1

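The tensor pred has a statically unknown shape, but thanks to the broadcasting semantics of tf.add() and tf.mul(), it will have the same dynamic shape as the value fed to the placeholder X (which also has a statically unknown shape).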

In the tutorial, X is fed a scalar value while the model is being trained, so pred will be a scalar (and tf.reduce_sum() will have no effect):

# Fit all training data
for epoch in range(training_epochs):
    for (x, y) in zip(train_X, train_Y):
        sess.run(optimizer, feed_dict={X: x, Y: y})

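When the log messages are generated, X is fed a vector (containing all of the training examples), so pred will also be a vector of the same length, and tf.reduce_sum() aggregates the cost down to a scalar: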

    # Display logs per epoch step
    if (epoch+1) % display_step == 0:
        c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})

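This example is slightly odd, because we would normally train a TensorFlow model on mini-batches of examples, but it demonstrates the usefulness of allowing a tf.placeholder() to have a statically undefined shape.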
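The two feeding patterns above can be sketched outside TensorFlow. The snippet below is only a NumPy analogue of the tutorial's cost expression, not the tutorial's code: the function name cost and the values of w, b, train_X and train_Y are made up for illustration, and NumPy broadcasting is assumed to behave like the TensorFlow broadcasting described in this answer.

```python
import numpy as np

def cost(x, y, w, b, n_samples):
    """NumPy analogue of the tutorial's cost (hypothetical helper)."""
    pred = x * w + b  # broadcasting: pred takes the shape of x
    return np.sum((pred - y) ** 2) / (2 * n_samples)

# Made-up weights and data, chosen only to keep the arithmetic simple.
w, b = 2.0, 1.0
train_X = np.array([1.0, 2.0, 3.0])
train_Y = np.array([3.0, 5.0, 8.0])
n_samples = train_X.shape[0]

# Training-style feed: one scalar sample, so the sum has nothing to reduce.
single = cost(train_X[0], train_Y[0], w, b, n_samples)

# Logging-style feed: the whole vector, so the sum collapses it to a scalar.
full = cost(train_X, train_Y, w, b, n_samples)

print(np.shape(single), np.shape(full))
```

Both results are scalars, but only in the second call does the sum actually combine several per-example errors.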

Answer 2

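I'm also quite new to TensorFlow, but it looks like you're right. As written, training runs on only one sample at a time, so the reduce_sum call is effectively useless there.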

Try removing the call to reduce_sum and replacing it with:

cost = tf.pow(pred-Y, 2)/(2*n_samples)

It should still run the same way, but it will break if you try to train on batches instead of single samples.
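To see what changes with a batch, here is a rough NumPy sketch (again an analogue, not TensorFlow code, with made-up values for w, b, x_batch and y_batch). Without the reduction the cost keeps one entry per example instead of being a single number, which is the situation this answer warns about.

```python
import numpy as np

# Made-up weights and batch, for illustration only.
w, b, n_samples = 2.0, 1.0, 3
x_batch = np.array([1.0, 2.0, 3.0])
y_batch = np.array([3.0, 5.0, 8.0])

pred = x_batch * w + b

# Without a reduction: one cost value per example in the batch.
per_example = (pred - y_batch) ** 2 / (2 * n_samples)

# With the reduction: a single scalar cost for the whole batch.
total = np.sum(per_example)

print(per_example.shape, np.shape(total))
```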

TensorFlow: input to the reduce_sum function
