Question

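I am implementing my first neural network for my high school graduation thesis. When I train it on the MNIST dataset I get good results, but only with a single hidden layer. With more than one hidden layer, the network always gives the same output after training. I tried re-deriving the derivatives of the error function for more than one layer, but I must be missing something... Here is the code of my backpropagation method: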

    public void BackPropagation(double[] error, bool batch)
    {
        double[][] temp = null;
        temp = NNMath.ArrayToMatrix(NNMath.EntryWiseProduct(error, NNMath.SigmoidDerivativeFromSigmoid(this.A[this.A.Length - 1])));
        this.DW[this.DW.Length - 1] = NNMath.TransposeMatrix(NNMath.DotProduct(NNMath.TransposeMatrix(temp), NNMath.ArrayToMatrix(this.A[this.DW.Length - 1])));
        temp[0].CopyTo(this.DB[this.DB.Length - 1], 0);

        for (int i = this.W.Length - 1; i > 0; i--)
        {
            temp = NNMath.DotProduct(temp, NNMath.TransposeMatrix(this.W[i]));
            temp = NNMath.EntryWiseProduct(temp, NNMath.ArrayToMatrix(NNMath.SigmoidDerivativeFromSigmoid(this.A[i])));
            if (batch)
            {
                this.DW[i - 1] = NNMath.EntryWiseSum(this.DW[i - 1], NNMath.DotProduct(NNMath.TransposeMatrix(this.A[i - 1]), temp));
                this.DB[i - 1] = NNMath.EntryWiseSum(this.DB[i - 1], temp[0]);
            }
            else
            {
                this.DW[i - 1] = NNMath.DotProduct(NNMath.TransposeMatrix(this.A[i - 1]), temp);
                temp[0].CopyTo(this.DB[i - 1], 0);
            }
        }
    }

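I created a static class called NNMath to do the matrix operations.

this.A is a 2-dimensional array in which each row holds the activations of one layer.
this.W is a 3-dimensional array in which each element is the weight matrix between two layers.
this.DW has the same shape as this.W, but holds the computed derivatives.
this.DB is a 2-dimensional array that holds the derivatives of the biases.
batch is true if the method is called during batch training.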

I am using MSE as the loss function.
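
For reference, the quantities that the method above appears to compute can be written out as follows. This is a sketch under assumptions that the post does not state explicitly: activations are treated as row vectors, $a^{(i)}$ stands for this.A[i], $W^{(i)}$ for this.W[i] (connecting layer $i$ to layer $i+1$), $L$ is the index of the output layer, and the error argument $e$ is taken to be $\partial E / \partial a^{(L)}$ (for example $a^{(L)} - y$ when $E = \tfrac{1}{2}\lVert a^{(L)} - y \rVert^2$). The factor $a \odot (1 - a)$ is the sigmoid derivative expressed through the activation, as in SigmoidDerivativeFromSigmoid.

```latex
\begin{aligned}
\delta^{(L)} &= e \odot a^{(L)} \odot \bigl(1 - a^{(L)}\bigr) \\
\delta^{(i)} &= \bigl(\delta^{(i+1)} \, W^{(i)\top}\bigr) \odot a^{(i)} \odot \bigl(1 - a^{(i)}\bigr) \\
\frac{\partial E}{\partial W^{(i)}} &= a^{(i)\top} \, \delta^{(i+1)} \\
\frac{\partial E}{\partial b^{(i)}} &= \delta^{(i+1)}
\end{aligned}
```

Under these assumptions, temp plays the role of $\delta$, this.DW[i] receives $\partial E / \partial W^{(i)}$, and this.DB[i] receives $\partial E / \partial b^{(i)}$. In batch mode the per-sample terms are meant to be summed over the batch, for every layer.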

Thanks in advance!

Edit: here is some more code from NNMath

    public static double[] EntryWiseSum(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            return null;
        double[] c = new double[a.Length];
        for (int i = 0; i < a.Length; i++)
            c[i] = a[i] + b[i];
        return c;
    }

    public static double SigmoidDerivativeFromSigmoid(double sigmoidA)
    {
        return sigmoidA * (1.0 - sigmoidA);
    }

    public static double[] SigmoidDerivativeFromSigmoid(double[] a)
    {
        double[] res = new double[a.Length];
        for (int i = 0; i < a.Length; i++)
            res[i] = SigmoidDerivativeFromSigmoid(a[i]);
        return res;
    }

Answer 1

I found my mistake. In case anyone is wondering, here is the corrected method.

    public void BackPropagationNew(double[] error, bool batch)
    {
        double[][] temp = null;
        temp = NNMath.ArrayToMatrix(NNMath.EntryWiseProduct(error, NNMath.SigmoidDerivativeFromSigmoid(this.A[this.A.Length - 1])));
        if (batch)
        {
            this.DW[this.DW.Length - 1] = NNMath.EntryWiseSum(this.DW[this.DW.Length - 1],  NNMath.TransposeMatrix(NNMath.DotProduct(NNMath.TransposeMatrix(temp), NNMath.ArrayToMatrix(this.A[this.DW.Length - 1]))));
            this.DB[this.DB.Length - 1] = NNMath.EntryWiseSum(this.DB[this.DB.Length - 1], temp[0]);
        }
        else
        {
            this.DW[this.DW.Length - 1] = NNMath.TransposeMatrix(NNMath.DotProduct(NNMath.TransposeMatrix(temp), NNMath.ArrayToMatrix(this.A[this.DW.Length - 1])));
            temp[0].CopyTo(this.DB[this.DB.Length - 1], 0);
        }

        for (int i = this.W.Length - 1; i > 0; i--)
        {
            temp = NNMath.DotProduct(temp, NNMath.TransposeMatrix(this.W[i]));
            temp = NNMath.EntryWiseProduct(temp, NNMath.ArrayToMatrix(NNMath.SigmoidDerivativeFromSigmoid(this.A[i])));
            if (batch)
            {
                this.DW[i - 1] = NNMath.EntryWiseSum(this.DW[i - 1], NNMath.DotProduct(NNMath.TransposeMatrix(this.A[i - 1]), temp));
                this.DB[i - 1] = NNMath.EntryWiseSum(this.DB[i - 1], temp[0]);
            }
            else
            {
                this.DW[i - 1] = NNMath.DotProduct(NNMath.TransposeMatrix(this.A[i - 1]), temp);
                temp[0].CopyTo(this.DB[i - 1], 0);
            }
        }
    }

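Because I am using mini-batch training and forgot to check the batch flag for the first weight update (the output layer), that layer's gradients were overwritten on every call instead of being accumulated, so the weights basically did not change. Anyway, thanks to anyone who tried to help! I will try to be more careful next time.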
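
The difference between overwriting and accumulating a gradient over a mini-batch can be illustrated with a small standalone sketch. It does not use the NNMath class from the post: the class GradientAccumulationDemo, the helper Accumulate, and the gradient values are all made up for illustration.

```csharp
using System;

public static class GradientAccumulationDemo
{
    // Hypothetical helper (not part of NNMath): returns total + sample, element by element.
    public static double[] Accumulate(double[] total, double[] sample)
    {
        if (total.Length != sample.Length)
            throw new ArgumentException("Gradient lengths must match.");
        double[] sum = new double[total.Length];
        for (int i = 0; i < total.Length; i++)
            sum[i] = total[i] + sample[i];
        return sum;
    }

    public static void Main()
    {
        // Made-up per-sample gradients for one layer over a mini-batch of three samples.
        double[][] sampleGradients =
        {
            new double[] { 0.5, -0.25 },
            new double[] { 0.25, 0.5 },
            new double[] { -0.5, 0.25 },
        };

        double[] overwritten = new double[2];
        double[] accumulated = new double[2];

        foreach (double[] g in sampleGradients)
        {
            g.CopyTo(overwritten, 0);                  // keeps only the most recent sample
            accumulated = Accumulate(accumulated, g);  // sums over the whole batch
        }

        Console.WriteLine("overwritten: " + string.Join(" ", overwritten));
        Console.WriteLine("accumulated: " + string.Join(" ", accumulated));
    }
}
```

After the loop, overwritten holds only the last sample's gradient (-0.5 and 0.25), while accumulated holds the sum over the batch (0.25 and 0.5). A layer whose buffer is overwritten therefore gets updated from a single sample per batch, which is the kind of mismatch the batch check in the corrected method removes.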

Neural network fails to learn with more than one hidden layer
