Question

So I made the simplest possible model (a perceptron/autoencoder), which, apart from the input generation, looks like this:

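import os
import shutil

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

import input  # assumed: the asker's own input-generation module (not shown)

N = 64 * 64 * 3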

def main():
    x = tf.placeholder(tf.float32, shape=(None, 64, 64, 3), name="x")

    with tf.name_scope("perceptron"):
        W = tf.Variable(tf.random_normal([N, N], stddev=1), name="W")
        b = tf.Variable(tf.random_normal([], stddev=1), name="b")
        y = tf.add(tf.matmul( tf.reshape(x, [-1,N]), W), b, name="y")
        act = tf.nn.sigmoid(y, name="sigmoid")
        yhat = tf.reshape(act, [-1, 64, 64, 3], name="yhat")

    with tf.name_scope("mse"):
        sq_error = tf.reduce_mean(np.square(x - yhat), axis=1)
        cost = tf.reduce_mean( sq_error, name="cost" )
        tf.summary.scalar("cost", cost)

    with tf.name_scope("conv_opt"): #Should just be called 'opt' here
        training_op = tf.train.AdamOptimizer(0.005).minimize(cost, name="train_op")

    with tf.device("/gpu:0"):
        config = tf.ConfigProto(allow_soft_placement=True)
        config.gpu_options.allow_growth = True
        sess = tf.Session(config=config)
        sess.run(tf.global_variables_initializer())

        logdir = "log_directory"
        if os.path.exists(logdir):
            shutil.rmtree(logdir)
        os.makedirs(logdir)

        input_gen = input.input_generator_factory(...)
        input_gen.initialize((64,64,3), 512)

        merged = tf.summary.merge_all()
        train_writer = tf.summary.FileWriter(logdir, sess.graph)

        for i in range(10):
            batch = input_gen.next_train_batch()
            summary,_ = sess.run([merged, training_op], feed_dict={x : batch})
            train_writer.add_summary(summary, i)
            print("Iteration %d completed" % (i))

if __name__ == "__main__":
    main()

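This produces the following TensorBoard graph. I assume the thick arrow from 'perceptron' to 'conv_opt' (which should probably just be called 'opt', sorry) corresponds to the back-propagated error signal, while the ?x64x64x3 arrows correspond to inference. But why 12 tensors? I don't see where that number comes from. I would have expected fewer, just those corresponding to W and b. Can someone explain what is going on?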

Answer 1

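I think the reason is that when you add the tf.train.AdamOptimizer(0.005).minimize(cost) op, it implicitly assumes that you are optimizing over all trainable variables (because you didn't specify otherwise). It therefore needs the values of those variables and of every intermediate tensor that takes part in computing cost, including the gradients (which are tensors too, and are added to the computation graph implicitly). Now let's count the variables and tensors in perceptron: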

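1. W
2. b
3. tf.reshape(x, [-1,N])
4. tf.matmul( ..., W)
5. Its gradient with respect to the first argument.
6. Its gradient with respect to the second argument.
7. tf.add(..., b, name="y")
8. Its gradient with respect to the first argument.
9. Its gradient with respect to the second argument.
10. tf.nn.sigmoid(y, name="sigmoid")
11. Its gradient.
12. tf.reshape(act, [-1, 64, 64, 3], name="yhat")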

I'm not actually 100% sure this is how the accounting is done, but it gives you an idea of where the number 12 could come from.

Just as an exercise, we can see that the same kind of accounting also explains where the number 9 in your graph comes from:

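1. x - yhat
2. Its gradient with respect to the first argument.
3. Its gradient with respect to the second argument.
4. np.square(...)
5. Its gradient.
6. tf.reduce_mean(..., axis=1)
7. Its gradient.
8. tf.reduce_mean( sq_error, name="cost" )
9. Its gradient.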
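
Both tallies can be reproduced with a few lines of plain Python. This is only a sketch of the counting heuristic used in this answer (one forward tensor per variable or op, plus one gradient tensor per op argument that is listed above). The table layout and the helper name count_tensors are made up for illustration, and nothing here queries TensorFlow or TensorBoard, so it does not confirm how TensorBoard itself counts.

```python
# Each entry: (description, forward tensors, gradient tensors), copied from
# the two lists in this answer. The gradient counts are the answer's
# assumption, not values read from a real TensorFlow graph.
PERCEPTRON = [
    ("W", 1, 0),
    ("b", 1, 0),
    ("tf.reshape(x, [-1, N])", 1, 0),
    ("tf.matmul(..., W)", 1, 2),   # gradients w.r.t. both arguments
    ("tf.add(..., b)", 1, 2),      # gradients w.r.t. both arguments
    ("tf.nn.sigmoid(y)", 1, 1),
    ("tf.reshape(act, [-1, 64, 64, 3])", 1, 0),
]

MSE = [
    ("x - yhat", 1, 2),            # gradients w.r.t. both arguments
    ("np.square(...)", 1, 1),
    ("tf.reduce_mean(..., axis=1)", 1, 1),
    ("tf.reduce_mean(sq_error)", 1, 1),
]


def count_tensors(entries):
    """Sum forward and gradient tensors over all entries of one scope."""
    return sum(forward + grads for _, forward, grads in entries)


if __name__ == "__main__":
    print("perceptron:", count_tensors(PERCEPTRON))
    print("mse:", count_tensors(MSE))
```

Under these assumptions the sums come out to 12 for the perceptron scope and 9 for the mse scope, matching the edge labels mentioned in the question.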

Understanding TensorBoard: why are 12 tensors sent to the optimizer?
