I am trying to implement backpropagation in a neural network with an input layer, one hidden layer, and an output layer, with Softmax on the last layer, but I can't see what's wrong. I know that the derivative of the cross-entropy loss with respect to the output is -target/output, and that the derivative of the cross-entropy with respect to the last layer's input is output - target, but the network doesn't seem to converge. What am I doing wrong?
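To make that second formula concrete, here is a small, self-contained numpy check (the toy logits and one-hot target are made up for illustration): the analytic gradient output - target should match a central finite-difference estimate of the loss.

    import numpy as np

    # Made-up toy values, just to check the formula numerically
    logits = np.array([0.5, -1.2, 2.0])
    target = np.array([0.0, 1.0, 0.0])  # one-hot

    def softmax(z):
        e = np.exp(z - np.max(z))
        return e / np.sum(e)

    def cross_entropy(z):
        return -np.sum(target * np.log(softmax(z)))

    # Analytic gradient w.r.t. the last layer's input: output - target
    analytic = softmax(logits) - target

    # Central finite differences on the loss
    eps = 1e-6
    numeric = np.array([
        (cross_entropy(logits + eps * np.eye(3)[i]) -
         cross_entropy(logits - eps * np.eye(3)[i])) / (2 * eps)
        for i in range(3)
    ])

    print(np.allclose(analytic, numeric, atol=1e-6))  # expect True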
import numpy as np

def _one_hidden_layer_train(self, inputs, targets):
    # Convert to column vectors of shape (n, 1)
    inputs = np.array(inputs, ndmin=2).T
    targets = np.array(targets, ndmin=2).T
    # Linear combination, input layer -> hidden layer
    hidden_inputs = np.dot(self.wih, inputs)
    # Hidden layer activation
    hidden_outputs = self.hidden_activation_function(hidden_inputs)
    # Linear combination, hidden layer -> output layer
    final_inputs = np.dot(self.who, hidden_outputs)
    # Numerically stable softmax
    shifted = np.exp(final_inputs - np.max(final_inputs))
    outputs = shifted / np.sum(shifted)
    # Derivative of the cross-entropy loss w.r.t. the softmax outputs
    output_errors = -targets / outputs
    # Update the weights between the hidden and output layers,
    # using (outputs - targets), the derivative w.r.t. final_inputs
    self.who += self.learning_rate * np.dot((outputs - targets), hidden_outputs.T)
    # Hidden layer error: the output error propagated back through who
    hidden_errors = np.dot(self.who.T, output_errors)
    # Update the weights between the input and hidden layers
    self.wih += self.learning_rate * np.dot(hidden_errors * self.hidden_activation_derivative(hidden_inputs), inputs.T)
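For comparison, here is a minimal sketch of what I currently believe the backward pass should look like: the output - target shortcut applied consistently, the error backpropagated through who before who is updated, and the weights stepped down the gradient. The attribute names mirror my class above, and whether this reasoning is actually correct is part of my question.

    def _one_hidden_layer_train_sketch(self, inputs, targets):
        # Same forward pass as above
        inputs = np.array(inputs, ndmin=2).T
        targets = np.array(targets, ndmin=2).T
        hidden_inputs = np.dot(self.wih, inputs)
        hidden_outputs = self.hidden_activation_function(hidden_inputs)
        final_inputs = np.dot(self.who, hidden_outputs)
        shifted = np.exp(final_inputs - np.max(final_inputs))
        outputs = shifted / np.sum(shifted)

        # Softmax + cross-entropy shortcut: dL/d(final_inputs) = outputs - targets
        delta_out = outputs - targets
        # Backpropagate through who BEFORE it is updated
        delta_hidden = np.dot(self.who.T, delta_out) * self.hidden_activation_derivative(hidden_inputs)

        # Gradient descent: subtract the gradients
        self.who -= self.learning_rate * np.dot(delta_out, hidden_outputs.T)
        self.wih -= self.learning_rate * np.dot(delta_hidden, inputs.T)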