Question

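When computing the delta values of a neural network after running backpropagation:

The value of delta(1) comes out as a scalar. Shouldn't it be a vector?

Update:

Taken from http://www.holehouse.org/mlclass/09_Neural_Networks_Learning.html

Specifically: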

Answer 1

First, as you probably know, each layer has n x m parameters (weights) to learn, so they form a two-dimensional matrix.

n is the number of nodes in the current layer
m is the number of nodes in the previous layer plus 1 (for the bias unit).

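There are n x m parameters because every node in the previous layer is connected to every node in the current layer.

I'm pretty sure the Delta (capital delta) for layer L is used to accumulate the partial derivative term of every parameter in layer L, so you have one Delta matrix per layer. To update the i-th row (the i-th node in the current layer) and the j-th column (the j-th node in the previous layer) of that matrix: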

D_(i,j) = D_(i,j) + a_j * delta_i
note a_j is the activation of the j-th node in the previous layer,
     delta_i is the error of the i-th node in the current layer
so each error is accumulated in proportion to the activation that feeds it.

So, to answer your question: Delta should be a matrix.
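
The accumulation rule above can be sketched in a few lines of plain Python. This is only an illustration under assumed toy sizes (three activations from the previous layer, two nodes in the current layer); the function name `accumulate` and all numbers are made up, not taken from the course notes. It shows that the per-node errors (lowercase delta) form a vector, while the accumulator (capital Delta) is a matrix with one entry per weight.

```python
# Minimal sketch of D_(i,j) = D_(i,j) + a_j * delta_i for one layer
# and one training example. All names and values are hypothetical.

def accumulate(D, a_prev, delta):
    """Add a_prev[j] * delta[i] into D[i][j], in place, and return D."""
    for i, delta_i in enumerate(delta):
        for j, a_j in enumerate(a_prev):
            D[i][j] += a_j * delta_i
    return D

a_prev = [1.0, 0.5, -2.0]  # assumed activations from the previous layer
delta = [0.1, -0.3]        # assumed errors, one per node in the current layer

# The accumulator starts at zero: one row per current-layer node,
# one column per previous-layer activation.
D = [[0.0] * len(a_prev) for _ in delta]
accumulate(D, a_prev, delta)

print(len(delta))         # length of the error vector
print(len(D), len(D[0]))  # rows and columns of the accumulator matrix
print(D)
```

In a full training loop the same call would presumably be repeated for every training example and every layer, with D reset to zero at the start of each pass over the data.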

Gradient descent: should the delta value be a scalar or a vector?
