Backpropagation not working: neural network in Java

Time: 2016-12-09 13:23:32

Tags: java matrix neural-network backpropagation sigmoid

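I created a simple 3-layer neural network based on this Python example: Link (PS: you have to scroll down until you reach Part 2).

This is my Java implementation of the code: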

private void trainNet()
{
    // INPUT is a 4*3 matrix
    // SYNAPSES is a 3*4 matrix
    // SYNAPSES2 is a 4*1 matrix
    // 4*3 matrix DOT 3*4 matrix => 4*4 matrix: unrefined test results
    double[][] layer1 = sigmoid(dot(inputs, synapses), false);

    // 4*4 matrix DOT 4*1 matrix => 4*1 matrix: 4 final test results
    double[][] layer2 = sigmoid(dot(layer1, synapses2), false);

    // 4*1 matrix - 4*1 matrix => 4*1 matrix: error of 4 test results
    double[][] layer2Error = subtract(outputs, layer2);

    // 4*1 matrix DOT 4*1 matrix => 4*1 matrix: percentage of change of 4 test results
    double[][] layer2Delta = dot(layer2Error, sigmoid(layer2, true));

    // 4*1 matrix DOT 3*1 matrix => 4*1 matrix
    double[][] layer1Error = dot(layer2Delta, synapses2);

    // 4*1 matrix DOT 4*4 matrix => 4*4 matrix: percentage of change of 4 test results
    double[][] layer1Delta = dot(layer1Error, sigmoid(layer1, true));

    double[][] transposedInputs = transpose(inputs);
    double[][] transposedLayer1 = transpose(layer1);

    //  4*4 matrix DOT 4*1 matrix => 4*1 matrix: the updated weights
    // Update the weights
    synapses2 = sum(synapses2, dot(transposedLayer1, layer2Delta));

    // 3*4 matrix DOT 4*4 matrix => 3*4 matrix: the updated weights
    // Update the weights
    synapses = sum(synapses, dot(transposedInputs, layer1Delta));

    // Test each value of two 4*1 matrices with each other
    testValue(layer2, outputs);
}

I wrote the dot, sum, subtract and transpose functions myself, and I am fairly sure they do their job correctly.
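The sigmoid(matrix, boolean) helper called in the listing is not shown. A minimal sketch of what such a helper could look like, assuming the boolean flag selects the derivative and that the derivative is evaluated on values that have already passed through the sigmoid (the s * (1 - s) form). The class name and the implementation are illustrative, not the original code:

```java
public class SigmoidSketch {
    // Hypothetical helper: applies the logistic function element-wise.
    // When derivative is true, it assumes every entry of m is already a
    // sigmoid output s and returns s * (1 - s) instead.
    static double[][] sigmoid(double[][] m, boolean derivative) {
        double[][] result = new double[m.length][];
        for (int i = 0; i < m.length; i++) {
            result[i] = new double[m[i].length];
            for (int j = 0; j < m[i].length; j++) {
                double v = m[i][j];
                result[i][j] = derivative
                        ? v * (1.0 - v)
                        : 1.0 / (1.0 + Math.exp(-v));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Forward pass on a 1*2 matrix, then the slope at those activations.
        double[][] activated = sigmoid(new double[][] {{0.0, 2.0}}, false);
        double[][] slope = sigmoid(activated, true);
        System.out.println(activated[0][0] + " " + slope[0][0]); // prints "0.5 0.25"
    }
}
```

Under this convention the derivative call must receive the layer output (as in sigmoid(layer2, true)), not the raw weighted sum.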

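The first batch of inputs gives an error of about 0.4, which is expected because the weights are random values. On the second run the error is smaller, but only by a tiny amount (0.001).

After 500,000 batches (2,000,000 tests in total), the network still had not produced a single correct value! So I tried a larger number of batches. With 1,000,000 batches (4,000,000 tests in total), the network produced as many as 16,900 correct results.

Can someone tell me what is going on?

These are the weights that were used:

First layer: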

  • 2.038829298171684 2.816232761170282 1.6740269469812146 1.634422766238497
  • 1.5890997594993828 1.7909325329112222 2.101840236824494 1.063579126586681
  • 3.761238407071311 3.757148454039234 3.7557450538398176 3.6715972104291605

Second layer:

  • -0.019603811941904248
  • 218.38253323323553
  • 53.70133275445734
  • -272.83589796861514

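Edit: Thanks to lsnare for pointing out that using a library would be much easier!

For those interested, here is the working code, using the library from math.nist.gov/javanumerics: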

private void trainNet()
{
    // INPUT is a 4*3 matrix
    // SYNAPSES is a 3*4 matrix
    // SYNAPSES2 is a 4*1 matrix
    // 4*3 matrix DOT 3*4 matrix => 4*4 matrix: unrefined test results
    Matrix hiddenLayer = sigmoid(inputs.times(synapses), false);

    // 4*4 matrix DOT 4*1 matrix => 4*1 matrix: 4 final test results
    Matrix outputLayer = sigmoid(hiddenLayer.times(synapses2), false);

    // 4*1 matrix - 4*1 matrix => 4*1 matrix: error of 4 test results
    Matrix outputLayerError = outputs.minus(outputLayer);

    // 4*1 matrix TIMES (element-wise) 4*1 matrix => 4*1 matrix: percentage of change of 4 test results
    Matrix outputLayerDelta = outputLayerError.arrayTimes(sigmoid(outputLayer, true));

    // 4*1 matrix DOT 1*4 matrix => 4*4 matrix
    Matrix hiddenLayerError = outputLayerDelta.times(synapses2.transpose());

    // 4*4 matrix TIMES (element-wise) 4*4 matrix => 4*4 matrix: percentage of change of 4 test results
    Matrix hiddenLayerDelta = hiddenLayerError.arrayTimes(sigmoid(hiddenLayer, true));

    //  4*4 matrix DOT 4*1 matrix => 4*1 matrix: the updated weights
    // Update the weights
    synapses2 = synapses2.plus(hiddenLayer.transpose().times(outputLayerDelta));

    // 3*4 matrix DOT 4*4 matrix => 3*4 matrix: the updated weights
    // Update the weights
    synapses = synapses.plus(inputs.transpose().times(hiddenLayerDelta));

    // Test each value of two 4*1 matrices with each other
    testValue(outputLayer.getArrayCopy(), outputs.getArrayCopy());
}
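Comparing the two listings, the working version differs in two places: the deltas are computed with arrayTimes (an element-wise product) where the first version called dot, and the hidden layer error multiplies by synapses2.transpose() rather than by synapses2 itself. A minimal sketch of that distinction with plain arrays follows. The helpers multiplyElementwise and the dimension-checked dot are hypothetical, written for illustration only, and are not the original hand-written functions:

```java
public class ElementwiseVsDot {
    // Hypothetical helper: multiplies matching entries of two equally shaped matrices.
    static double[][] multiplyElementwise(double[][] a, double[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("row counts differ");
        }
        double[][] result = new double[a.length][];
        for (int i = 0; i < a.length; i++) {
            if (a[i].length != b[i].length) {
                throw new IllegalArgumentException("column counts differ in row " + i);
            }
            result[i] = new double[a[i].length];
            for (int j = 0; j < a[i].length; j++) {
                result[i][j] = a[i][j] * b[i][j];
            }
        }
        return result;
    }

    // Hypothetical helper: matrix product that rejects incompatible shapes.
    static double[][] dot(double[][] a, double[][] b) {
        if (a[0].length != b.length) {
            throw new IllegalArgumentException(
                    "inner dimensions differ: " + a[0].length + " vs " + b.length);
        }
        double[][] result = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b[0].length; j++) {
                for (int k = 0; k < b.length; k++) {
                    result[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] error = {{0.5}, {-0.5}, {0.25}, {-0.25}};  // 4*1, made-up values
        double[][] slope = {{0.25}, {0.25}, {0.5}, {0.5}};    // 4*1, made-up values
        double[][] delta = multiplyElementwise(error, slope); // stays 4*1
        System.out.println(delta.length + "*" + delta[0].length);
        try {
            dot(error, slope); // 4*1 times 4*1 is not a valid matrix product
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A dot function that validates the inner dimensions, as sketched here, would reject the 4*1 by 4*1 products from the first listing instead of returning a result silently.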

1 answer:

Answer 0 (score: 0)

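In general, when writing code that involves advanced mathematics or numerical computation (such as linear algebra), it is better to use an existing library written by experts in the field than to write your own functions. A standard library will produce more accurate results and be more efficient. For example, in the blog post you referenced, the author uses the numpy library to compute the dot products and matrix transposes. For Java, you can use the Java Matrix Package (JAMA), developed by NIST: http://math.nist.gov/javanumerics/jama/
For example, to transpose a matrix: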

import Jama.Matrix;

double[][] in = {{0,0,1},{0,1,1},{1,0,1},{1,1,1}}; // 4*3 input matrix
Matrix input = new Matrix(in);
input = input.transpose(); // now 3*4

I'm not sure whether this will completely solve your problem, but hopefully it will save you from writing extra code in the future.