I implemented DeepMind's DQN algorithm in TensorFlow, and I'm hitting this error on the line where I call

optimizer.minimize(self.loss)

ValueError: No gradients provided for any variable...

From reading other posts about this error, I gather it means the loss function does not depend on any of the tensors used to set up the model, but in my code I can't see how that could be the case. The loss clearly depends on a call to the qloss() function, which in turn depends on all of the layer tensors for its calculation.
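(A quick way to see which variables minimize() is complaining about is to ask the graph for the gradients directly; this is just a diagnostic sketch, not part of my model code. Variables paired with None are the ones the loss has no path to.)

import tensorflow as tf

# Diagnostic sketch: any trainable variable paired with None here is one the
# loss has no gradient path to, which is what minimize() is complaining about.
grads = tf.gradients(self.loss, tf.trainable_variables())
for var, grad in zip(tf.trainable_variables(), grads):
    if grad is None:
        print("no gradient for", var.name)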
Answer 0 (score: 1)
I found that the problem was that in my qloss() function I was pulling values out of the tensors, operating on them, and returning plain values. While those values did depend on the tensors, they were not themselves wrapped in tensors, so TensorFlow had no way to tell that they depended on the tensors in the graph.
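The faulty version isn't shown here, but the pattern it describes looks roughly like this hypothetical sketch (where actions and rewards are assumed to arrive as NumPy minibatch arrays): evaluating tensors to NumPy values mid-computation severs the connection to the graph.

import numpy as np
import tensorflow as tf

# Hypothetical sketch of the broken pattern: sess.run() pulls plain NumPy
# arrays out of the graph, so nothing returned here is a tensor and no
# gradients can flow back to the network variables.
def qloss_broken(actions, rewards, target_Qs, pred_Qs, sess):
    target_vals = sess.run(target_Qs)               # graph connection lost here
    pred_vals = sess.run(pred_Qs)
    ys = rewards + DISCOUNT * target_vals
    action_Qs = pred_vals[np.arange(BATCH_SIZE), actions]
    return np.square(np.minimum(np.abs(ys - action_Qs), 1.0))   # NumPy values, not a tensor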
I fixed it by changing qloss() so that it performs its operations directly on the tensors and returns a tensor. Here is the new function:
import tensorflow as tf

# DISCOUNT and BATCH_SIZE are hyperparameters defined elsewhere in the module.
def qloss(actions, rewards, target_Qs, pred_Qs):
    """
    Q-function loss with target freezing - the difference between the observed
    Q value, taking into account the recently received r (while holding future
    Qs at target), and the predicted Q value the agent had for (s, a) at the
    time of the update.

    Params:
    actions   - The action for each experience in the minibatch
    rewards   - The reward for each experience in the minibatch
    target_Qs - The target Q value from s' for each experience in the minibatch
    pred_Qs   - The Q values predicted by the model network

    Returns:
    A tensor with the Q-function loss for each experience, clipped to [-1, 1]
    and squared.
    """
    # Target values: observed reward plus the discounted target-network estimate.
    ys = rewards + DISCOUNT * target_Qs

    # For each row of pred_Qs in the batch we want the predicted Q for the action
    # taken in that experience, so build [experience#, action#] index pairs as
    # graph ops and gather from the pred_Qs tensor.
    gather_is = tf.stack([tf.range(BATCH_SIZE), actions], axis=1)
    action_Qs = tf.gather_nd(pred_Qs, gather_is)

    # TD error, clipped to [-1, 1] and squared, entirely as tensor operations so
    # gradients can flow back to the network variables.
    losses = ys - action_Qs
    clipped_squared_losses = tf.square(tf.minimum(tf.abs(losses), 1.0))
    return clipped_squared_losses
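With qloss() returning a tensor, its output can be reduced and handed straight to optimizer.minimize(). A minimal wiring sketch follows; STATE_SIZE, NUM_ACTIONS, the dense layer standing in for the real Q-network, and the RMSProp settings are assumptions, not from the original post.

import tensorflow as tf

# Minimal wiring sketch; STATE_SIZE, NUM_ACTIONS, the stand-in dense layer,
# and the optimizer settings are assumptions.
states    = tf.placeholder(tf.float32, [BATCH_SIZE, STATE_SIZE])
actions   = tf.placeholder(tf.int32,   [BATCH_SIZE])
rewards   = tf.placeholder(tf.float32, [BATCH_SIZE])
target_Qs = tf.placeholder(tf.float32, [BATCH_SIZE])
pred_Qs   = tf.layers.dense(states, NUM_ACTIONS)     # stand-in for the model network

loss = tf.reduce_mean(qloss(actions, rewards, target_Qs, pred_Qs))
train_op = tf.train.RMSPropOptimizer(0.00025).minimize(loss)   # no longer raises ValueError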