If we represent the unrolled RNN as follows:

final_state, output = Unroll_RNN_10_Steps(initial_state, input)

how can we compute the Jacobian of final_state with respect to initial_state?
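In other words, for each example b in the batch, the goal is a units x units matrix with entries jacobian[b, i, j] = d final_state[b, i] / d initial_state[b, j].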
Answer (score: 1)
Faster ways of doing this are in the works, but as of TF 1.5 here is some example code:
import tensorflow as tf
import tensorflow.contrib.eager as tfe

# GradientTape requires eager execution (available via tf.contrib.eager in TF 1.5).
tfe.enable_eager_execution()

batch_size = 2
steps = 10
units = 3

cell = tf.contrib.rnn.BasicLSTMCell(units)
initial_state = cell.zero_state(batch_size, tf.float32)
inputs = [tf.random_uniform([batch_size, units]) for _ in range(steps)]

with tf.contrib.eager.GradientTape(persistent=True) as g:
    g.watch(initial_state.c)
    state = initial_state
    for i in range(steps):
        _, state = cell(inputs[i], state)
    # Split the final state into scalar tensors so that we can
    # compute gradients with respect to each scalar below.
    states = tf.split(state.c, units, axis=1)

# Compute the gradients of each scalar in the final state
# with respect to the initial state for each example in the batch.
# Each element in grads has shape [batch_size, units].
grads = [g.gradient(states[i], [initial_state.c])[0] for i in range(units)]
# Stack grads so that their shape is [units, batch_size, units].
grads = tf.stack(grads)
# Reshape grads to [batch_size, units, units] so that
# jacobian[b, :, :] is the Jacobian of the b'th example in the batch.
jacobian = tf.transpose(grads, perm=[1, 0, 2])
print("Jacobian shape: " + str(jacobian.shape))