Question

Thanks for reading this post!

A quick question for RNN enthusiasts:

I understand that backpropagation through time (BPTT) has at least 3 steps:

For each element in a sequence:
Step 1 - Compute the 'error ratio' of each neuron, from the upper layer to the lower layer.
Step 2 - Compute a 'weight delta' for each weight (X) using the error ratio from Step 1, and push it into an array.

After the sequence is finished:
Step 3 - Sum all weight deltas of weight (X) and add the sum to the current value of weight (X).
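
The three steps above can be sketched for a one-neuron simple RNN as follows. This is only an illustration under assumptions that are not in the post: a tanh neuron, a squared-error loss on the hidden state at every time step, and weight deltas that already include the factor `-lr` so that adding them in Step 3 performs gradient descent. The names `bptt_update`, `forward`, and `sequence_loss` are hypothetical.

```python
import math

def forward(w, u, xs):
    """Hidden states of a one-neuron RNN; hs[0] is the initial state."""
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def sequence_loss(w, u, xs, ys):
    """Assumed loss: squared error between hidden state and target, summed over time."""
    return sum(0.5 * (h - y) ** 2 for h, y in zip(forward(w, u, xs)[1:], ys))

def bptt_update(w, u, xs, ys, lr=0.1):
    hs = forward(w, u, xs)
    w_deltas, u_deltas = [], []
    carried = 0.0  # error flowing back from time step t+1
    for t in reversed(range(len(xs))):
        # Step 1: error term of the neuron at time step t
        delta = ((hs[t + 1] - ys[t]) + carried) * (1.0 - hs[t + 1] ** 2)
        # Step 2: one weight delta per weight, pushed into an array
        w_deltas.append(-lr * delta * hs[t])
        u_deltas.append(-lr * delta * xs[t])
        carried = w * delta
    # Step 3: sum the deltas and add them to the current weights
    return w + sum(w_deltas), u + sum(u_deltas)
```

Calling `bptt_update(0.5, 0.5, [1.0, -0.5, 0.3], [0.2, -0.1, 0.4])` returns the updated pair `(w, u)` after one pass over the sequence.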

I am now trying to implement a Clockwork RNN (CW RNN) from the paper here: http://jmlr.org/proceedings/papers/v32/koutnik14.pdf

As I understand it, every 'module' in the hidden layer has the same number of neurons and differs only in its clock.

The forward pass of the CW RNN seems quite simple and intuitive. The backward pass, however, is a different story.
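
For reference, the rule that decides which modules run in the forward pass can be written in a couple of lines. This is a small sketch, assuming time steps are counted from 0 and using powers of two as illustrative clock periods; `active_modules` is a hypothetical helper name.

```python
def active_modules(t, periods):
    """Indices of the modules that are executed at time step t."""
    return [i for i, period in enumerate(periods) if t % period == 0]

periods = [1, 2, 4, 8]  # assumed example periods, one per module
for t in range(8):
    print(t, active_modules(t, periods))
# 0 [0, 1, 2, 3]
# 1 [0]
# 2 [0, 1]
# 3 [0]
# 4 [0, 1, 2]
# 5 [0]
# 6 [0, 1]
# 7 [0]
```

Every module that is not listed for a given `t` keeps its activation from the previous time step.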

Quoting the paper:

The backward pass of the error propagation is similar to
SRN as well. The only difference is that the error propagates
only from modules that were executed at time step t. The
error of non-activated modules gets copied back in time
(similarly to copying the activations of nodes not activated
at the time step t during the corresponding forward pass),
where it is added to the back-propagated error.

This is where I get confused.

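Which of the backpropagation steps above apply to a non-activated module in the hidden layer?
(That is, a module for which timestep MOD clock period != 0.)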

Step 1, Step 2, or both?

Thanks again for your help!

Answer 1

I'm not sure about your BPTT algorithm (if you can provide a reference, I could try to understand it better).

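But after a close look at Figure 2 and equations (1) and (2), the non-activated modules should simply pass the gradient back through time. That means no gradient is computed for the inactive modules; the gradient value at time t is just passed on to time t-1.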

So my guess is neither Step 1 nor Step 2: just copy the values from the previous step.
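
A minimal sketch of that idea, not the paper's reference implementation: each module is a single tanh neuron, module `i` only receives recurrent input from modules `j >= i` (slower or equal clocks), time steps are counted from 0, and the loss is an assumed squared error on the sum of the hidden states. The names `cw_forward`, `cw_loss`, and `cw_backward` are hypothetical. For an inactive module the incoming error is copied back unchanged and no weight gradient is accumulated.

```python
import math

PERIODS = [1, 2, 4]  # assumed clock periods, one scalar neuron per module
N = len(PERIODS)

def cw_forward(W, w_in, xs):
    """Hidden states hs[t + 1][i] after step t; hs[0] is the zero initial state."""
    hs = [[0.0] * N]
    for t, x in enumerate(xs):
        prev, cur = hs[-1], []
        for i in range(N):
            if t % PERIODS[i] == 0:   # active: recompute the activation
                a = w_in[i] * x + sum(W[i][j] * prev[j] for j in range(i, N))
                cur.append(math.tanh(a))
            else:                     # inactive: copy the old activation
                cur.append(prev[i])
        hs.append(cur)
    return hs

def cw_loss(hs, ys):
    return sum(0.5 * (sum(h) - y) ** 2 for h, y in zip(hs[1:], ys))

def cw_backward(W, w_in, xs, ys):
    hs = cw_forward(W, w_in, xs)
    gW = [[0.0] * N for _ in range(N)]
    g_in = [0.0] * N
    carried = [0.0] * N  # error arriving from time step t+1
    for t in reversed(range(len(xs))):
        prev, cur = hs[t], hs[t + 1]
        out_err = sum(cur) - ys[t]
        delta = [carried[i] + out_err for i in range(N)]
        carried = [0.0] * N
        for i in range(N):
            if t % PERIODS[i] == 0:
                # active module: usual error term and weight gradients
                da = delta[i] * (1.0 - cur[i] ** 2)
                g_in[i] += da * xs[t]
                for j in range(i, N):
                    gW[i][j] += da * prev[j]
                    carried[j] += W[i][j] * da
            else:
                # inactive module: error is copied back in time and added
                # to whatever the active modules back-propagate
                carried[i] += delta[i]
    return gW, g_in
```

Comparing the returned gradients against finite differences of `cw_loss` is a quick way to check that the copy rule is consistent with the forward pass.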

Clockwork RNN (CW RNN)
