I am trying to implement backpropagation in a neural network with an input layer, one hidden layer, and an output layer, with Softmax on the last layer, but I can't see what's wrong. I know that the derivative of the cross-entropy loss with respect to the output is -target/output, and that the derivative of the cross-entropy with respect to the last layer's input is output - target, but the network doesn't seem to converge. What am I doing wrong?
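To make that second formula concrete, here is a small, self-contained numpy check (the toy logits and one-hot target are made up for illustration): the analytic gradient output - target should match a central finite-difference estimate of the loss.

    import numpy as np

    # Made-up toy values, just to check the formula numerically
    logits = np.array([0.5, -1.2, 2.0])
    target = np.array([0.0, 1.0, 0.0])  # one-hot

    def softmax(z):
        e = np.exp(z - np.max(z))
        return e / np.sum(e)

    def cross_entropy(z):
        return -np.sum(target * np.log(softmax(z)))

    # Analytic gradient w.r.t. the last layer's input: output - target
    analytic = softmax(logits) - target

    # Central finite differences on the loss
    eps = 1e-6
    numeric = np.array([
        (cross_entropy(logits + eps * np.eye(3)[i]) -
         cross_entropy(logits - eps * np.eye(3)[i])) / (2 * eps)
        for i in range(3)
    ])

    print(np.allclose(analytic, numeric, atol=1e-6))  # expect True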
import numpy as np

def _one_hidden_layer_train(self, inputs, targets):
    # Convert to column vectors of shape (n, 1)
    inputs = np.array(inputs, ndmin=2).T
    targets = np.array(targets, ndmin=2).T
    # Linear combination, input layer -> hidden layer
    hidden_inputs = np.dot(self.wih, inputs)
    # Hidden layer activation
    hidden_outputs = self.hidden_activation_function(hidden_inputs)
    # Linear combination, hidden layer -> output layer
    final_inputs = np.dot(self.who, hidden_outputs)
    # Numerically stable softmax
    shifted = np.exp(final_inputs - np.max(final_inputs))
    outputs = shifted / np.sum(shifted)
    # Derivative of the cross-entropy loss w.r.t. the softmax outputs
    output_errors = -targets / outputs
    # Update the weights between the hidden and output layers,
    # using (outputs - targets), the derivative w.r.t. final_inputs
    self.who += self.learning_rate * np.dot((outputs - targets), hidden_outputs.T)
    # Hidden layer error: the output error propagated back through who
    hidden_errors = np.dot(self.who.T, output_errors)
    # Update the weights between the input and hidden layers
    self.wih += self.learning_rate * np.dot(hidden_errors * self.hidden_activation_derivative(hidden_inputs), inputs.T)
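For comparison, here is a minimal sketch of what I currently believe the backward pass should look like: the output - target shortcut applied consistently, the error backpropagated through who before who is updated, and the weights stepped down the gradient. The attribute names mirror my class above, and whether this reasoning is actually correct is part of my question.

    def _one_hidden_layer_train_sketch(self, inputs, targets):
        # Same forward pass as above
        inputs = np.array(inputs, ndmin=2).T
        targets = np.array(targets, ndmin=2).T
        hidden_inputs = np.dot(self.wih, inputs)
        hidden_outputs = self.hidden_activation_function(hidden_inputs)
        final_inputs = np.dot(self.who, hidden_outputs)
        shifted = np.exp(final_inputs - np.max(final_inputs))
        outputs = shifted / np.sum(shifted)

        # Softmax + cross-entropy shortcut: dL/d(final_inputs) = outputs - targets
        delta_out = outputs - targets
        # Backpropagate through who BEFORE it is updated
        delta_hidden = np.dot(self.who.T, delta_out) * self.hidden_activation_derivative(hidden_inputs)

        # Gradient descent: subtract the gradients
        self.who -= self.learning_rate * np.dot(delta_out, hidden_outputs.T)
        self.wih -= self.learning_rate * np.dot(delta_hidden, inputs.T)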