Gradient is None when using the tf.function decorator

Asked: 2019-04-19 19:06:01

Tags: python tensorflow tensorflow2.0

I am trying to migrate my code to TensorFlow 2.0, but I can't manage to create an explicit graph with tf.function. In particular, given the following model:

def new_dueling_model(name, input_size, output_size, learning_rate, clip_grad=False):
    states = tf.keras.Input(shape=(input_size,))
    h1 = tf.keras.layers.Dense(256, activation='relu')(states)

    # State value function
    value_h2 = tf.keras.layers.Dense(128, activation='relu')(h1)
    value_output = tf.keras.layers.Dense(1)(value_h2)

    # Advantage function
    advantage_h2 = tf.keras.layers.Dense(128, activation='relu')(h1)
    advantage_output = tf.keras.layers.Dense(output_size)(advantage_h2)

    outputs = value_output + (advantage_output - tf.reduce_mean(advantage_output, axis=1, keepdims=True))

    model = tf.keras.Model(inputs=states, outputs=outputs, name=name)

    return model
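The aggregation in the model above combines the two streams as Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a)), the standard dueling-network formula. A minimal NumPy sketch (illustration only, not part of the model) shows what the mean subtraction does:

```python
import numpy as np

# Toy value and advantage outputs for a batch of 2 states, 3 actions.
value = np.array([[2.0], [-1.0]])          # shape (2, 1), V(s)
advantage = np.array([[1.0, 3.0, 2.0],
                      [0.0, -2.0, 2.0]])   # shape (2, 3), A(s, a)

# Same combination as in the model: broadcast V over actions and
# subtract the per-state mean advantage.
q_values = value + (advantage - advantage.mean(axis=1, keepdims=True))
```

Subtracting the per-state mean makes the advantage stream zero-mean across actions, which keeps V and A identifiable (otherwise any constant could be shifted between them).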

and the following function to train it:

def q_train(states, actions, targets, is_weights, model, output_size, learning_rate, clip_grad):
    optimizer = tf.keras.optimizers.RMSprop(learning_rate=learning_rate)

    with tf.GradientTape() as tape:
        outputs = model(states)
        q_values = tf.multiply(outputs, (tf.one_hot(tf.squeeze(actions), output_size)))

        loss_value = tf.reduce_mean(is_weights * tf.losses.mean_squared_error(targets, q_values))


    grads = tape.gradient(loss_value, model.trainable_variables)

    selected_q_values = tf.reduce_sum(q_values, axis=1)
    selected_targets = tf.reduce_sum(targets, axis=1)
    td_errors = tf.clip_by_value(selected_q_values - selected_targets, -1.0, 1.0)

    if clip_grad:
        optimizer.apply_gradients(zip([tf.clip_by_value(grad, -1.0, 1.0) for grad in grads], model.trainable_variables))
    else:
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

    return td_errors

In my main loop I have the following call to train the model:

   # states, actions, targets and is_weights are numpy arrays
   # model is created using new_dueling_model
   td_errors = q_train(states, actions, targets, is_weights, model, num_actions, 0.00025, False)
   # ...

Everything works as expected but, comparing it with my tf1.x code, the training step is much slower. So I decorated the q_train function with tf.function to get a performant TF graph. However, now every time I call the function, grads is always None:

@tf.function
def q_train(...):
    # ...
    grads = tape.gradient(loss_value, model.trainable_variables)
    # grads here are None 

What is going wrong?

1 Answer:

Answer 0 (score: 0)

I solved the problem. First, I installed the nightly package with:

pip install tf-nightly-2.0-preview

At that point, after running the code, I got the following error:

ValueError: tf.function-decorated function tried to create variables on non-first call

I solved this new error by moving the creation of the optimizer

optimizer = tf.keras.optimizers.RMSprop(learning_rate=learning_rate)

outside the function decorated with @tf.function. After that, everything works.
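A minimal sketch of the resulting structure (simplified model and names are illustrative, not the asker's full code): any object that creates variables, such as the optimizer, is built once at module level, so that the second and later traced calls of the tf.function do not try to create variables again.

```python
import numpy as np
import tensorflow as tf

# Model and optimizer are created ONCE, outside the tf.function,
# so no variables are created on non-first calls of train_step.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(2),
])
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)

@tf.function
def train_step(states, targets):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(states) - targets))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

states = np.random.rand(8, 3).astype(np.float32)
targets = np.random.rand(8, 2).astype(np.float32)
loss1 = train_step(states, targets)
loss2 = train_step(states, targets)  # second call: no variable-creation error
```

Had the `optimizer = ...` line been inside `train_step`, the second call would raise the same `ValueError` shown above, because RMSprop lazily creates its slot variables on the first `apply_gradients`.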