Question

Code taken from: http://adventuresinmachinelearning.com/python-tensorflow-tutorial/

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Python optimisation variables
learning_rate = 0.5
epochs = 10
batch_size = 100

# declare the training data placeholders
# input x - for 28 x 28 pixels = 784
x = tf.placeholder(tf.float32, [None, 784])
# now declare the output data placeholder - 10 digits
y = tf.placeholder(tf.float32, [None, 10])
# now declare the weights connecting the input to the hidden layer
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([300]), name='b1')
# and the weights connecting the hidden layer to the output layer
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random_normal([10]), name='b2')
# calculate the output of the hidden layer
hidden_out = tf.add(tf.matmul(x, W1), b1)
hidden_out = tf.nn.relu(hidden_out)
# now calculate the output layer - in this case, let's use a softmax activated
# output layer
y_ = tf.nn.softmax(tf.add(tf.matmul(hidden_out, W2), b2))
y_clipped = tf.clip_by_value(y_, 1e-10, 0.9999999)
cross_entropy = -tf.reduce_mean(tf.reduce_sum(y * tf.log(y_clipped)
                         + (1 - y) * tf.log(1 - y_clipped), axis=1))
# add an optimiser
optimiser = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cross_entropy)
# finally setup the initialisation operator
init_op = tf.global_variables_initializer()

# define an accuracy assessment operation
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# start the session
with tf.Session() as sess:
    # initialise the variables
    sess.run(init_op)
    total_batch = int(len(mnist.train.labels) / batch_size)
    for epoch in range(epochs):
        avg_cost = 0
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size=batch_size)
            _, c = sess.run([optimiser, cross_entropy],
                            feed_dict={x: batch_x, y: batch_y})
            avg_cost += c / total_batch
        print("Epoch:", (epoch + 1), "cost =", "{:.3f}".format(avg_cost))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))

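How does TensorFlow recognize which parameters it needs to optimize? In the code above we need to optimize W1, W2, b1 and b2, but we never specify that anywhere. We do ask GradientDescentOptimizer to minimize cross_entropy, but we never tell it that it has to change the values of W1, W2, b1 and b2 in order to do so. So how does it know which parameters cross_entropy depends on?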

Answer 1

TensorFlow works on the premise of something called a computational graph. Essentially, as soon as you write something like:

hidden_out = tf.add(tf.matmul(x, W1), b1)

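TensorFlow goes: OK, the output clearly depends on W1, so I'll connect an edge from "hidden_out" to W1. The same thing happens for y_, y_clipped and cross_entropy, so in the end you have a graph that connects cross_entropy to W1. Pick your favorite graph traversal algorithm, and TensorFlow can find the connection between cross_entropy and W1.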
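To make the graph idea concrete, here is a minimal pure-Python sketch. It is a toy model, not TensorFlow's actual implementation: the `Node` class and `find_variables` function are hypothetical names invented for illustration. Each node remembers what it was computed from, and a depth-first walk back from cross_entropy collects every variable it depends on.

```python
# Toy model of a computational graph. NOT TensorFlow's real implementation;
# `Node` and `find_variables` are hypothetical names used for illustration.


class Node:
    """A graph node that remembers the nodes it was computed from."""

    def __init__(self, name, inputs=(), is_variable=False):
        self.name = name
        self.inputs = list(inputs)
        self.is_variable = is_variable


def find_variables(root):
    """Depth-first walk from `root` back to every variable it depends on."""
    found, seen, stack = [], set(), [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue  # already visited through another path
        seen.add(id(node))
        if node.is_variable:
            found.append(node.name)
        stack.extend(node.inputs)
    return sorted(found)


# Mirror the structure of the network in the question.
x = Node("x")  # placeholders: inputs, but not variables
y = Node("y")
W1 = Node("W1", is_variable=True)
b1 = Node("b1", is_variable=True)
W2 = Node("W2", is_variable=True)
b2 = Node("b2", is_variable=True)
hidden_out = Node("hidden_out", [x, W1, b1])
y_ = Node("y_", [hidden_out, W2, b2])
y_clipped = Node("y_clipped", [y_])
cross_entropy = Node("cross_entropy", [y, y_clipped])

print(find_variables(cross_entropy))  # ['W1', 'W2', 'b1', 'b2']
```

The placeholders x and y are reachable from cross_entropy as well, but they are not marked as variables, so the walk does not report them.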

Answer 2

Cory Nezin's answer is only partially correct and may lead to wrong assumptions!

You actually do specify which parameters get optimized (i.e. are trainable), namely by doing this:

# now declare the weights connecting the input to the hidden layer
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([300]), name='b1')
# and the weights connecting the hidden layer to the output layer
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random_normal([10]), name='b2')

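In short, TensorFlow will only update tf.Variables. If you use something like tf.Variable(..., trainable=False), that variable gets no updates, no matter what "the network depends on". You still define it, and the network still propagates through that part, but that specific variable never receives an update.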
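The effect of a trainable flag can be sketched without TensorFlow. The following is a toy illustration under stated assumptions: `ToyVariable`, `loss` and `minimize_step` are hypothetical names, and the model is a single line w*x + b fitted to one data point rather than the network from the question. The loss depends on both w and b, yet only the variable marked trainable is changed by the update step.

```python
# Toy sketch of a trainable flag. `ToyVariable`, `loss` and `minimize_step`
# are hypothetical names, not TensorFlow APIs.


class ToyVariable:
    """Holds a value plus a flag saying whether the optimiser may change it."""

    def __init__(self, value, trainable=True):
        self.value = value
        self.trainable = trainable


def loss(w, b):
    """Squared error of the model w*x + b on one point (x=2, target=10)."""
    return (w.value * 2.0 + b.value - 10.0) ** 2


def minimize_step(w, b, learning_rate=0.05):
    """One gradient-descent step that skips non-trainable variables."""
    error = w.value * 2.0 + b.value - 10.0
    # Analytic gradients of the squared error with respect to w and b.
    grads_and_vars = [(2.0 * error * 2.0, w), (2.0 * error, b)]
    for grad, var in grads_and_vars:
        if var.trainable:
            var.value -= learning_rate * grad


w = ToyVariable(1.0)
b = ToyVariable(0.0, trainable=False)  # the loss depends on b, but b is frozen

loss_before = loss(w, b)
for _ in range(50):
    minimize_step(w, b)
loss_after = loss(w, b)

print(b.value)                   # 0.0, never updated
print(loss_after < loss_before)  # True, w alone absorbed the correction
```

The gradient for b is still computed here, which matches the point above that the computation still flows through a frozen variable; it is only the update that is skipped.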

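Cory's answer is correct in that the network automatically works out which values to update, but you are the one who specifies what can be updated in the first place!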

How does TensorFlow know which variables need to be changed for optimization?