I want to compute \frac{\partial L}{\partial x_t} for a recurrent network, where L is the cost function and x_t is the internal state of the recurrent network at time step t. Is it possible to do this with TensorFlow?
I tried the following simple example. Suppose L = (x_T - x_{target})^2, where x_T is the network state at the Tth time step and x_{target} is a given target value.
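
For reference, this is the chain-rule expansion I have in mind (my own spelling-out of the definition above, assuming the usual recurrence x_{k+1} = f(x_k, u_{k+1}), with each factor below a Jacobian of that recurrence):

\frac{\partial L}{\partial x_t}
  = \frac{\partial L}{\partial x_T}
    \frac{\partial x_T}{\partial x_{T-1}} \cdots
    \frac{\partial x_{t+1}}{\partial x_t},
\qquad
\frac{\partial L}{\partial x_T} = 2\,(x_T - x_{target}).

For t < T these products are generally nonzero, which is why I expect the gradient to contain backpropagation-through-time terms.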
My code is:
# the shape of state_series is (time_n, batch_size, rnn_n)
state_series, last_state = rnn.dynamic_rnn(rnn_cell, input, time_major=True)
# cost_mask is a one-dimensional tensor in which the Tth element is one, and all the other elements are zero.
cost_fcn = tf.reduce_mean(tf.reduce_mean((state_series - state_target)**2, axis=[1, 2]) * cost_mask)
desired_gradient = tf.gradients(cost_fcn, state_series)
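
For completeness, here is a self-contained version of the snippet that I can run end to end (a minimal sketch; the placeholder shapes, the BasicRNNCell, the one-hot cost_mask, and the dummy feed values are my own assumptions and are not part of my actual model):

import numpy as np
import tensorflow as tf

time_n, batch_size, rnn_n, input_n = 5, 1, 3, 2
T = time_n - 1  # index of the time step whose state enters the cost

inputs = tf.placeholder(tf.float32, [time_n, batch_size, input_n])
state_target = tf.placeholder(tf.float32, [time_n, batch_size, rnn_n])
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(rnn_n)

# the shape of state_series is (time_n, batch_size, rnn_n)
state_series, last_state = tf.nn.dynamic_rnn(rnn_cell, inputs, dtype=tf.float32, time_major=True)

# one-hot mask selecting the Tth time step
cost_mask = tf.one_hot(T, time_n)
cost_fcn = tf.reduce_mean(tf.reduce_mean((state_series - state_target)**2, axis=[1, 2]) * cost_mask)

# tf.gradients returns one gradient tensor per tensor in its second argument
desired_gradient = tf.gradients(cost_fcn, state_series)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    grad = sess.run(desired_gradient,
                    feed_dict={inputs: np.random.randn(time_n, batch_size, input_n).astype(np.float32),
                               state_target: np.ones((time_n, batch_size, rnn_n), dtype=np.float32)})
    print(grad)  # the slices for t < T come out as all zeros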
However, desired_gradient is zero before time step T (that is, its value is [a_0, a_1, a_2, ..., a_{T-1}, a_T] with a_0 = a_1 = a_2 = ... = a_{T-1} = 0). This means TensorFlow does not take backpropagation through time into account when computing this gradient. How can I solve this problem? Thanks.
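
To illustrate what I expected, here is a sketch (my own, under the same assumed shapes and cell as above, not part of my original code) that unrolls the cell with a Python loop so that every per-step state is a separate tensor sitting on the recurrence path:

import tensorflow as tf

time_n, batch_size, rnn_n, input_n = 5, 1, 3, 2
T = time_n - 1

inputs = tf.placeholder(tf.float32, [time_n, batch_size, input_n])
state_target = tf.placeholder(tf.float32, [batch_size, rnn_n])
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(rnn_n)

state = rnn_cell.zero_state(batch_size, tf.float32)
states = []  # one tensor per time step; each one feeds the next step
for t in range(time_n):
    _, state = rnn_cell(inputs[t], state)  # the cell reuses its weights across calls
    states.append(state)

cost_fcn = tf.reduce_mean((states[T] - state_target)**2)

# one gradient tensor per time step; for t < T these are generally nonzero,
# because each states[t] reaches the cost through the later steps
grads_per_step = tf.gradients(cost_fcn, states)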