So I tried out TensorFlow's eager execution, but my implementation wasn't successful. I used GradientTape, and while the program runs, there is no visible update to any of the weights. I have seen some sample algorithms and tutorials that use optimizer.apply_gradients() to update all the variables, but I'm assuming I am not using it properly.
import tensorflow as tf
import tensorflow.contrib.eager as tfe
# enabling eager execution
tf.enable_eager_execution()
# establishing hyperparameters
LEARNING_RATE = 20
TRAINING_ITERATIONS = 3
# establishing all LABELS
LABELS = tf.constant(tf.random_normal([3, 1]))
# print(LABELS)
# stub statement for input
init = tf.Variable(tf.random_normal([3, 1]))
# declare and initialize all weights
weight1 = tfe.Variable(tf.random_normal([2, 3]))
bias1 = tfe.Variable(tf.random_normal([2, 1]))
weight2 = tfe.Variable(tf.random_normal([3, 2]))
bias2 = tfe.Variable(tf.random_normal([3, 1]))
weight3 = tfe.Variable(tf.random_normal([2, 3]))
bias3 = tfe.Variable(tf.random_normal([2, 1]))
weight4 = tfe.Variable(tf.random_normal([3, 2]))
bias4 = tfe.Variable(tf.random_normal([3, 1]))
weight5 = tfe.Variable(tf.random_normal([3, 3]))
bias5 = tfe.Variable(tf.random_normal([3, 1]))
VARIABLES = [weight1, bias1, weight2, bias2, weight3, bias3, weight4, bias4, weight5, bias5]
def thanouseEyes(input):  # nn model aka: Thanouse's Eyes
    layerResult = tf.nn.relu(tf.matmul(weight1, input) + bias1)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight2, input) + bias2)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight3, input) + bias3)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight4, input) + bias4)
    input = layerResult
    layerResult = tf.nn.softmax(tf.matmul(weight5, input) + bias5)
    return layerResult
# Begin training and update variables
optimizer = tf.train.AdamOptimizer(LEARNING_RATE)
with tf.GradientTape(persistent=True) as tape: # gradient calculation
    for i in range(TRAINING_ITERATIONS):
        COST = tf.reduce_sum(LABELS - thanouseEyes(init))
        GRADIENTS = tape.gradient(COST, VARIABLES)
        optimizer.apply_gradients(zip(GRADIENTS, VARIABLES))
        print(weight1)
Answer 0 (score: 0)
Your use of the optimizer looks fine, but the computation defined by thanouseEyes() will always return [1., 1., 1.] regardless of the variables, so the gradients are always 0 and the variables will therefore never be updated (print(thanouseEyes(init)) and print(GRADIENTS) should demonstrate that).
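For example, a quick check along these lines (just a sketch assuming the variables and thanouseEyes from your program are already defined; diag_tape and diag_cost are illustrative names) should show the constant output and zero gradients:

# Diagnostic sketch: the model output does not depend on the variables.
print(thanouseEyes(init))  # [[1.], [1.], [1.]] for any weights
# Gradients of the cost with respect to the variables are therefore zero.
with tf.GradientTape() as diag_tape:
    diag_cost = tf.reduce_sum(LABELS - thanouseEyes(init))
print(diag_tape.gradient(diag_cost, VARIABLES))  # expected to be all-zero tensors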
Digging in a bit deeper: tf.nn.softmax is applied to x = tf.matmul(weight5, input) + bias5, which has shape [3, 1]. So tf.nn.softmax(x) effectively computes [softmax(x[0]), softmax(x[1]), softmax(x[2])], since tf.nn.softmax is (by default) applied to the last axis of its input. x[0], x[1], and x[2] are vectors with a single element each, so softmax(x[i]) is always 1.0.
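A tiny standalone sketch (unrelated to your model) illustrating this behaviour of tf.nn.softmax:

x = tf.constant([[2.0], [-1.0], [0.5]])  # shape [3, 1], like tf.matmul(weight5, input) + bias5
print(tf.nn.softmax(x))          # softmax over the last axis of each single-element row: [[1.], [1.], [1.]]
print(tf.nn.softmax(x, axis=0))  # softmax over the 3 rows: an actual probability distribution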
Hope that helps.
A couple of other points you may be interested in, though they are unrelated to your question:
Since TensorFlow 1.11, the tf.contrib.eager module is not needed in your program. Replace all occurrences of tfe with tf (i.e., tf.Variable instead of tfe.Variable) and you will get the same results.
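For example, the first pair of variable declarations in your program would simply become:

weight1 = tf.Variable(tf.random_normal([2, 3]))
bias1 = tf.Variable(tf.random_normal([2, 1]))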
Computation executed within the context of a GradientTape is "recorded", i.e., it holds on to intermediate tensors so that gradients can be computed later. Long story short, you want to move the GradientTape into the body of the loop:
for i in range(TRAINING_ITERATIONS):
    with tf.GradientTape() as tape:
        COST = tf.reduce_sum(LABELS - thanouseEyes(init))
    GRADIENTS = tape.gradient(COST, VARIABLES)
    optimizer.apply_gradients(zip(GRADIENTS, VARIABLES))