Multilayer neural network is not learning

Asked: 2021-01-29 13:25:40

Tags: java neural-network backpropagation

I wrote a multilayer neural network and train it with the error backpropagation algorithm. The problem is that the network does not learn: after a few iterations the network's total error starts jumping from very small negative values to large positive ones. I am using the unipolar sigmoid activation function. The input layer has 30 neurons, the hidden layer 17, and the output layer 10.
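For reference, the unipolar sigmoid is σ(x) = 1 / (1 + e^(−x)), and its derivative can be expressed through the output as σ'(x) = y·(1 − y), which is the out * (1 - out) factor in the error formulas below. A minimal standalone sketch (class and method names are mine, not from my code):

```java
// Minimal sketch of the unipolar sigmoid and its derivative.
// Class/method names are illustrative, not from the question's Matrix code.
public class SigmoidDemo {
    // sigma(x) = 1 / (1 + e^-x), output always in (0, 1)
    public static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // derivative written in terms of the output y = sigma(x): y * (1 - y)
    public static double sigmoidDerivFromOutput(double y) {
        return y * (1.0 - y);
    }

    public static void main(String[] args) {
        double y = sigmoid(0.0);
        System.out.println(y);                         // 0.5
        System.out.println(sigmoidDerivFromOutput(y)); // 0.25
    }
}
```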

Here is the method that computes the whole-network (output-layer) error: error = outLayerOut * (1 - outLayerOut) * (expected - outLayerOut).

// Element-wise output-layer delta: out * (1 - out) * (expected - out)
public static Matrix errorCal(Matrix expected, Matrix our){
    Matrix temp = new Matrix(expected.rows, expected.cols);
    for(int i = 0; i < expected.cols; i++){
        for(int j = 0; j < expected.rows; j++){
            temp.data[j][i] = our.data[j][i] * (1 - our.data[j][i]) * (expected.data[j][i] - our.data[j][i]);
        }
    }
    return temp;
}
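For a single output neuron, that delta rule works out like this (a standalone sketch with made-up numbers, independent of the Matrix class above):

```java
// Scalar version of the output-layer delta rule used in errorCal.
public class OutputDeltaDemo {
    // delta = out * (1 - out) * (expected - out)
    public static double outputDelta(double expected, double out) {
        return out * (1.0 - out) * (expected - out);
    }

    public static void main(String[] args) {
        // e.g. the network outputs 0.8 where the target is 1.0:
        // 0.8 * 0.2 * 0.2 ≈ 0.032
        System.out.println(outputDelta(1.0, 0.8));
    }
}
```

Note that the out * (1 - out) factor vanishes when the output saturates near 0 or 1, which makes saturated sigmoid units learn very slowly.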

And the method for the hidden-layer error: hiddenError = hiddenLayerIn * (1 - hiddenLayerIn) * (averageOutLayerError * hiddenLayerOut).

// Hidden-layer delta: averages all output-layer errors into one scalar,
// then scales it by the hidden output and the sigmoid derivative of the input
public static Matrix backPropError(Matrix hiddenOut, Matrix outError, Matrix inputOut){
    Matrix temp = new Matrix(hiddenOut.rows, 1, true);

    // average of all output-layer errors
    double avgOutError = 0;
    for(int i = 0; i < outError.toArray().size(); i++)
        avgOutError += outError.toArray().get(i);

    avgOutError = avgOutError / outError.toArray().size();

    for(int i = 0; i < temp.cols; i++){
        for(int j = 0; j < temp.rows; j++){
            temp.data[j][i] = inputOut.data[j][i] * (1 - inputOut.data[j][i]) * (avgOutError * hiddenOut.data[j][i]);
        }
    }
    return temp;
}
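For comparison, textbook backpropagation does not average the output errors: each hidden neuron's delta uses the sum of the output deltas weighted by that neuron's outgoing weights, and the derivative is taken of the hidden neuron's own output. A minimal sketch with plain arrays (all names are mine, not from my code):

```java
// Textbook hidden-layer delta: weighted sum of output deltas, not an average.
public class HiddenDeltaDemo {
    // hiddenDelta[j] = hiddenOut[j] * (1 - hiddenOut[j]) * sum_k( w[j][k] * outDelta[k] )
    // where w[j][k] is the weight from hidden neuron j to output neuron k.
    public static double[] hiddenDeltas(double[] hiddenOut, double[][] w, double[] outDelta) {
        double[] delta = new double[hiddenOut.length];
        for (int j = 0; j < hiddenOut.length; j++) {
            double sum = 0.0;
            for (int k = 0; k < outDelta.length; k++) {
                sum += w[j][k] * outDelta[k];
            }
            delta[j] = hiddenOut[j] * (1.0 - hiddenOut[j]) * sum;
        }
        return delta;
    }

    public static void main(String[] args) {
        double[] hiddenOut = {0.5, 0.9};
        double[][] w = {{1.0, -1.0}, {0.5, 0.5}};
        double[] outDelta = {0.2, 0.1};
        double[] d = hiddenDeltas(hiddenOut, w, outDelta);
        System.out.println(d[0] + " " + d[1]);
    }
}
```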

And the weight-update method: newWeight = oldWeight + (learningRate * error * layerOut), where layerOut depends on which layer's weights are being updated.

// Weight update: new = old + learningRate * error * layer output.
// The hard-coded sizes (51 and 30) select which layer's weights are being
// updated, and the i+1 / i+2 column offsets spread the update across the
// weight matrix in three passes.
public static Matrix weightActualization(Matrix weights, Matrix error, double learningRate, Matrix hiddenLayerOut){
    Matrix temp = new Matrix(weights.rows, weights.cols);

    Matrix one = null;
    Matrix two = null;
    Matrix three = null;

    if(weights.toArray().size() == 51) {
        one = new Matrix(17, 1);
        two = new Matrix(17, 1);
        three = new Matrix(17, 1);
    }else if (weights.toArray().size() == 30){
        one = new Matrix(10, 1);
        two = new Matrix(10, 1);
        three = new Matrix(10, 1);
    }

    for(int i = 0; i < one.cols; i++){
        for(int j = 0; j < one.rows; j++){
            one.data[j][i] = weights.data[j][i] + (learningRate * error.data[j][i] * hiddenLayerOut.data[j][i]);
            temp.data[j][i] = one.data[j][i];
        }
    }

    for(int i = 0; i < two.cols; i++){
        for(int j = 0; j < two.rows; j++){
            two.data[j][i] = weights.data[j][i+1] + (learningRate * error.data[j][i] * hiddenLayerOut.data[j][i]);
            temp.data[j][i+1] = two.data[j][i];
        }
    }

    for(int i = 0; i < three.cols; i++){
        for(int j = 0; j < three.rows; j++){
            three.data[j][i] = weights.data[j][i+2] + (learningRate * error.data[j][i] * hiddenLayerOut.data[j][i]);
            temp.data[j][i+2] = three.data[j][i];
        }
    }

    return temp;
}
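For comparison, the standard update rule can be written as one loop over the full weight matrix, where each weight gets its own neuron's delta times its own input activation (an outer product). A standalone sketch assuming w[j][i] connects input i of the layer to neuron j (array names are mine):

```java
// Standard backprop weight update: w[j][i] += lr * delta[j] * layerIn[i].
public class WeightUpdateDemo {
    public static void updateWeights(double[][] w, double[] delta, double[] layerIn, double lr) {
        for (int j = 0; j < w.length; j++) {          // one row per neuron
            for (int i = 0; i < w[j].length; i++) {   // one column per input
                w[j][i] += lr * delta[j] * layerIn[i];
            }
        }
    }

    public static void main(String[] args) {
        double[][] w = {{0.1, 0.2}, {0.3, 0.4}};
        double[] delta = {0.5, -0.5};
        double[] in = {1.0, 2.0};
        updateWeights(w, delta, in, 0.1);
        System.out.println(w[0][0] + " " + w[0][1] + " " + w[1][0] + " " + w[1][1]);
    }
}
```

Written this way, no layer sizes need to be hard-coded: the loop bounds come from the matrix dimensions themselves.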

0 Answers