How do I update weights using TensorFlow eager execution?

Date: 2018-11-18 03:45:53

Tags: python tensorflow machine-learning

So I tried out TensorFlow's eager execution, but my implementation was unsuccessful: I used tf.GradientTape, and while the program runs there is no visible update to any of the weights. I've seen some sample algorithms and tutorials that use optimizer.apply_gradients() to update all of the variables, but I assume I'm not using it properly.

import tensorflow as tf
import tensorflow.contrib.eager as tfe

# enabling eager execution
tf.enable_eager_execution()

# establishing hyperparameters
LEARNING_RATE = 20
TRAINING_ITERATIONS = 3

# establishing all LABELS
LABELS = tf.constant(tf.random_normal([3, 1]))
# print(LABELS)

# stub statement for input
init = tf.Variable(tf.random_normal([3, 1]))

# declare and initialize all weights
weight1 = tfe.Variable(tf.random_normal([2, 3]))
bias1 = tfe.Variable(tf.random_normal([2, 1]))
weight2 = tfe.Variable(tf.random_normal([3, 2]))
bias2 = tfe.Variable(tf.random_normal([3, 1]))
weight3 = tfe.Variable(tf.random_normal([2, 3]))
bias3 = tfe.Variable(tf.random_normal([2, 1]))
weight4 = tfe.Variable(tf.random_normal([3, 2]))
bias4 = tfe.Variable(tf.random_normal([3, 1]))
weight5 = tfe.Variable(tf.random_normal([3, 3]))
bias5 = tfe.Variable(tf.random_normal([3, 1]))

VARIABLES = [weight1, bias1, weight2, bias2, weight3, bias3, weight4, bias4, weight5, bias5]


def thanouseEyes(input):  # nn model aka: Thanouse's Eyes
    layerResult = tf.nn.relu(tf.matmul(weight1, input) + bias1)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight2, input) + bias2)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight3, input) + bias3)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight4, input) + bias4)
    input = layerResult
    layerResult = tf.nn.softmax(tf.matmul(weight5, input) + bias5)
    return layerResult


# Begin training and update variables
optimizer = tf.train.AdamOptimizer(LEARNING_RATE)

with tf.GradientTape(persistent=True) as tape:  # gradient calculation
    for i in range(TRAINING_ITERATIONS):
        COST = tf.reduce_sum(LABELS - thanouseEyes(init))
        GRADIENTS = tape.gradient(COST, VARIABLES)
        optimizer.apply_gradients(zip(GRADIENTS, VARIABLES))
        print(weight1)

1 Answer:

Answer 0 (score: 0)

Your use of the optimizer looks fine, but the computation defined by thanouseEyes() will always return [1., 1., 1.] regardless of the variables, so the gradient is always 0 and the variables will never be updated (print(thanouseEyes(init)) and print(GRADIENTS) should demonstrate that).
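
A minimal sanity check of those two claims, reusing the variables defined in your program:

# the network output is constant, so the gradients have nothing to push on
print(thanouseEyes(init))  # always prints [[1.], [1.], [1.]]

with tf.GradientTape() as tape:
    COST = tf.reduce_sum(LABELS - thanouseEyes(init))
print(tape.gradient(COST, VARIABLES))  # a list of all-zero tensors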

Digging in a bit deeper: tf.nn.softmax is applied to x = tf.matmul(weight5, input) + bias5, which has shape [3, 1]. So tf.nn.softmax(x) is effectively computing [softmax(x[0]), softmax(x[1]), softmax(x[2])], since tf.nn.softmax (by default) is applied along the last axis of its input. x[0], x[1], and x[2] are vectors with one element each, so softmax(x[i]) is always 1.0.
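
To see this concretely (the sample values here are made up):

# softmax over the default last axis of a [3, 1] tensor normalizes each
# one-element row on its own, which always yields 1.0
x = tf.constant([[2.0], [0.5], [-1.0]])
print(tf.nn.softmax(x))          # [[1.], [1.], [1.]]
print(tf.nn.softmax(x, axis=0))  # normalizes down the column; entries sum to 1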

Hope that helps.

A couple of other points you may find useful, unrelated to your question:

  • As of TensorFlow 1.11, you don't need the tf.contrib.eager module in your program. Replace all occurrences of tfe with tf (i.e., tf.Variable instead of tfe.Variable) and you'll get the same result.

  • Computations executed within the context of a GradientTape are "recorded", i.e., the tape holds on to intermediate tensors so that gradients can be computed later. Long story short, you want to move the GradientTape inside the loop body:


for i in range(TRAINING_ITERATIONS):
    with tf.GradientTape() as tape:
        COST = tf.reduce_sum(LABELS - thanouseEyes(init))
    GRADIENTS = tape.gradient(COST, VARIABLES)
    optimizer.apply_gradients(zip(GRADIENTS, VARIABLES))
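
Note that even with the tape inside the loop, the gradients stay zero until the softmax issue above is dealt with. One possible fix, purely as a sketch of what you might have intended, is to normalize over the column axis in the last layer of thanouseEyes():

# hypothetical change to the final layer: softmax over axis 0 makes the
# three outputs a distribution, so the gradients are no longer all zero
layerResult = tf.nn.softmax(tf.matmul(weight5, input) + bias5, axis=0)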