Computing the biases of the hidden layer

Date: 2018-08-25 07:05:51

Tags: python python-3.x machine-learning neural-network deep-learning

I wrote a small piece of Python code for a network of 4 neurons (2 inputs, 3 neurons in the hidden layer, and 1 output neuron). The code is deliberately written out step by step because I want to understand each operation in detail. It works, but I still have a problem with the biases!

for epoch in range(epochs):
    layer1, predictions = predict_output_neural(features, weights_11, weights_12, weights_13, weight_ouput, bias_11, bias_12, bias_13, bias_output)
    if epoch % 10 == 0:
        layer1, predictions = predict_output_neural(features, weights_11, weights_12, weights_13, weight_ouput, bias_11, bias_12, bias_13, bias_output)
        print (cost(predictions, targets))
    """
        There are a lot of things to do here !
        to do the back propagation, we will first train the ouput neural
    """
    #Init gradient
    weights_gradient_output = np.zeros(weight_ouput.shape)
    bias_gradient_output = 0

    weights_gradient_11 = np.zeros(weights_11.shape)
    bias_gradient_11 = 0

    weights_gradient_12 = np.zeros(weights_12.shape)
    bias_gradient_12 = 0

    weights_gradient_13 = np.zeros(weights_13.shape)
    bias_gradient_13 = 0
    # Go through each training example (row)
    for neural_input, feature, target, prediction in zip(layer1, features, targets, predictions):

        output_error = prediction - target
        output_delta = output_error * derivative_activation_y(prediction)

        error_neural_hidden_11 = output_delta * weight_ouput[0]
        error_neural_hidden_12 = output_delta * weight_ouput[1]
        error_neural_hidden_13 = output_delta * weight_ouput[2]


        error_neural_11 = error_neural_hidden_11 * derivative_activation_y(neural_input[0])
        error_neural_12 = error_neural_hidden_12 * derivative_activation_y(neural_input[1])
        error_neural_13 = error_neural_hidden_13 * derivative_activation_y(neural_input[2])

        weights_gradient_output += neural_input * output_delta
        #bias_output += output_delta

        weights_gradient_11 += feature * error_neural_11
        #bias_11 += error_neural_11

        weights_gradient_12 += feature * error_neural_12
        #bias_12 += error_neural_12

        weights_gradient_13 += feature * error_neural_13
        #bias_13 += error_neural_13


    # Update the weights and biases once per epoch
    weight_ouput = weight_ouput - (learning_rate * weights_gradient_output)
    bias_output = bias_output - (learning_rate * bias_gradient_output)
    weights_11 =  weights_11 - (learning_rate * weights_gradient_11)
    bias_11 =  bias_11 - (learning_rate * bias_gradient_11)
    weights_12 =  weights_12 - (learning_rate * weights_gradient_12)
    bias_12 =  bias_12 - (learning_rate * bias_gradient_12)
    weights_13 =  weights_13 - (learning_rate * weights_gradient_13)
    bias_13 =  bias_13 - (learning_rate * bias_gradient_13)

This gives me good results, but as soon as I uncomment the lines that update the bias of each neuron, it goes completely wrong! The cost converges to 0.5 (e.g. 0.4999999).

Do you know why? The bias gradient updates look correct, don't they?

1 Answer:

Answer 0 (score: 1)

If you look at your gradient-accumulation code here,

    weights_gradient_output += neural_input * output_delta
    #bias_output += output_delta

you are adding the error directly to the bias itself rather than to bias_gradient_output. As a result, the bias is effectively updated with a learning rate of 1, which is probably much larger than you intended. (The same problem applies to bias_11 and the others.)
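
A minimal sketch of the fix, reusing the variable names from your code: inside the per-row loop, accumulate the bias errors into the bias_gradient_* variables, and let your existing update step at the end of the epoch apply the learning rate.

    # Inside the loop over rows: accumulate into the gradient accumulators,
    # not into the biases themselves.
    weights_gradient_output += neural_input * output_delta
    bias_gradient_output += output_delta

    weights_gradient_11 += feature * error_neural_11
    bias_gradient_11 += error_neural_11

    weights_gradient_12 += feature * error_neural_12
    bias_gradient_12 += error_neural_12

    weights_gradient_13 += feature * error_neural_13
    bias_gradient_13 += error_neural_13

    # Your existing update step then scales each accumulated gradient once:
    # bias_output = bias_output - (learning_rate * bias_gradient_output), etc.

With this change the bias steps are scaled by learning_rate just like the weight steps, instead of taking a full-size step on every row.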