I'll start by summarizing my understanding of the cuDNN 5.1 RNN functions:
Tensor dimensions
x = [seq_length, batch_size, vocab_size] # input
y = [seq_length, batch_size, hiddenSize] # output
dx = [seq_length, batch_size, vocab_size] # input gradient
dy = [seq_length, batch_size, hiddenSize] # output gradient
hx = [num_layer, batch_size, hiddenSize] # input hidden state
hy = [num_layer, batch_size, hiddenSize] # output hidden state
cx = [num_layer, batch_size, hiddenSize] # input cell state
cy = [num_layer, batch_size, hiddenSize] # output cell state
dhx = [num_layer, batch_size, hiddenSize] # input hidden state gradient
dhy = [num_layer, batch_size, hiddenSize] # output hidden state gradient
dcx = [num_layer, batch_size, hiddenSize] # input cell state gradient
dcy = [num_layer, batch_size, hiddenSize] # output cell state gradient
w = [param size] # parameters (weights & bias)
dw = [param size] # parameter gradients
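To make the shapes above concrete, here is a small Python sketch that builds them from the five sizes. It only does bookkeeping and makes no cuDNN calls. The function names `rnn_tensor_shapes` and `num_elements` and the example sizes are made up for illustration, and a unidirectional RNN is assumed. The size of w and dw is left out because the post only gives it as [param size].

```python
def rnn_tensor_shapes(seq_length, batch_size, vocab_size, hidden_size, num_layer):
    """Return the logical shapes listed above, keyed by tensor name.

    Assumption: unidirectional RNN, so the last dimension of y is hidden_size.
    """
    per_step_input = (seq_length, batch_size, vocab_size)
    per_step_output = (seq_length, batch_size, hidden_size)
    per_layer_state = (num_layer, batch_size, hidden_size)

    shapes = {
        "x": per_step_input,
        "dx": per_step_input,
        "y": per_step_output,
        "dy": per_step_output,
    }
    # Hidden and cell states, and their gradients, all share one shape.
    for name in ("hx", "hy", "cx", "cy", "dhx", "dhy", "dcx", "dcy"):
        shapes[name] = per_layer_state
    return shapes


def num_elements(shape):
    """Number of scalars a buffer of this shape must hold."""
    count = 1
    for dim in shape:
        count *= dim
    return count


# Example sizes (arbitrary, for illustration only).
shapes = rnn_tensor_shapes(seq_length=20, batch_size=32, vocab_size=1000,
                           hidden_size=256, num_layer=2)
for name in sorted(shapes):
    print(name, shapes[name], num_elements(shapes[name]))
```

With these example sizes, x needs 20 * 32 * 1000 = 640000 elements and each state tensor needs 2 * 32 * 256 = 16384.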
cudnnRNNForwardTraining / cudnnRNNForwardInference
input: x, hx, cx, w
output: y, hy, cy
cudnnRNNBackwardData
input: y, dy, dhy, dcy, w, hx, cx
output: dx, dhx, dcx
cudnnRNNBackwardWeights
input: x, hx, y, dw
output: dw
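One way to sanity-check the input/output lists above is to walk the three calls in order and see which buffers the caller has to provide up front. The sketch below does only that; it is plain Python bookkeeping, not cuDNN code. The helper `missing_inputs` is a hypothetical name, and the real calls also take descriptors and work buffers that the summary does not list.

```python
# Inputs and outputs of each call, copied from the summary above.
CALLS = [
    ("cudnnRNNForwardTraining", ["x", "hx", "cx", "w"], ["y", "hy", "cy"]),
    ("cudnnRNNBackwardData", ["y", "dy", "dhy", "dcy", "w", "hx", "cx"],
     ["dx", "dhx", "dcx"]),
    ("cudnnRNNBackwardWeights", ["x", "hx", "y", "dw"], ["dw"]),
]


def missing_inputs(calls, provided):
    """Walk the calls in order; report inputs that nothing has produced yet."""
    available = set(provided)
    missing = {}
    for name, inputs, outputs in calls:
        absent = [tensor for tensor in inputs if tensor not in available]
        if absent:
            missing[name] = absent
        # Outputs of this call become available to the calls after it.
        available.update(outputs)
    return missing


# Everything the caller is assumed to supply before the first call.
full = missing_inputs(CALLS, ["x", "hx", "cx", "w", "dy", "dhy", "dcy", "dw"])
print(full)     # {}

# Supplying only the forward inputs leaves gaps in the backward calls.
partial = missing_inputs(CALLS, ["x", "hx", "cx", "w"])
print(partial)
# {'cudnnRNNBackwardData': ['dy', 'dhy', 'dcy'], 'cudnnRNNBackwardWeights': ['dw']}
```

The second result shows what the lists imply: dy, dhy and dcy have to come from outside the three calls (for example from the loss), and dw must exist before cudnnRNNBackwardWeights runs because it is listed as both input and output.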
Question:
1. init hx, cx, dhy, dcy as NULL
2. init w (weights: small random values, bias: 1)
3. forward
4. backward data
5. backward weights
6. update weights: w += dw
7. dw = 0
8. go to 3
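The control flow of the steps above can be sketched in plain Python with stand-in functions. Nothing here calls cuDNN: `forward`, `backward_data` and `backward_weights` are hypothetical stubs, the weight values and the fake gradient of 0.5 are made up, and the stub for the weights step is assumed to add into dw rather than overwrite it, as the listing of dw as both input and output suggests. Step 6 is applied exactly as written in the post; the sketch does not settle whether its sign or scaling is right.

```python
def run_loop(iterations, reset_dw=True):
    """Mirror steps 1-8 with stand-in functions; no cuDNN calls are made."""
    log = []

    def forward(x, hx, cx, w):
        log.append("forward")
        return "y"  # placeholder for the output buffer

    def backward_data(y, dy, dhy, dcy, w, hx, cx):
        log.append("backward_data")
        return "dx"  # placeholder for the input gradient

    def backward_weights(x, hx, y, dw):
        log.append("backward_weights")
        for i in range(len(dw)):
            dw[i] += 0.5  # fake gradient, accumulated into dw

    hx = cx = dhy = dcy = None            # step 1: NULL states
    w = [0.01, -0.02, 1.0]                # step 2: small weights, bias 1
    dw = [0.0] * len(w)

    for _ in range(iterations):           # step 8: go back to step 3
        y = forward("x", hx, cx, w)       # step 3
        backward_data(y, "dy", dhy, dcy, w, hx, cx)  # step 4
        backward_weights("x", hx, y, dw)  # step 5
        for i in range(len(w)):           # step 6, as written in the post
            w[i] += dw[i]
        if reset_dw:                      # step 7
            for i in range(len(dw)):
                dw[i] = 0.0
    return w, log


w_reset, log = run_loop(3)
w_no_reset, _ = run_loop(3, reset_dw=False)
print(log[:3])                     # ['forward', 'backward_data', 'backward_weights']
print(w_reset[2], w_no_reset[2])   # 2.5 4.0
```

Under the accumulation assumption, skipping step 7 makes each iteration reapply the gradients of all earlier iterations, which is why the bias ends at 4.0 instead of 2.5 after three passes.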
I posted the same question here (I'll keep the answers in sync).