Goal: switch the cost function used in the backpropagation step of my neural network from the quadratic cost to the cross-entropy cost.
Problem: the gradient explodes with the cross-entropy cost function, while the quadratic cost behaves as expected. I think I implemented it incorrectly.
Idea: maybe the code for the "backward error" needs to change?
(The remaining weight and bias updates are omitted to save space.)
def back_prop(self):
    # Last layer error (output)
    if self.costfunc == 1:
        # Gradient of the cross-entropy cost function for classification
        self.error[-1] = (self.active_z[-1] - self.target[np.newaxis, :].T) / sigmoid_derivative(self.active_z[-1])
    else:
        # Gradient of the quadratic cost function for regression
        self.error[-1] = self.active_z[-1] - self.target[np.newaxis, :].T
    # Error in backward layers (input <-- hidden)
    for i in range(self.hidden_layers - 1, -1, -1):
        self.error[i] = np.dot(self.error[i + 1], self.weights[i + 1].T) * sigmoid_derivative(self.active_z[i])
    # Partial derivatives of the cost function for the first (leftmost) layer
    delta_weight_left = np.dot(self.X.T, self.error[0])
    delta_bias_left = np.sum(self.error[0], axis=0)
    # Regularization and update of the first layer
    delta_weight_left += self.reg_term * self.weights[0]
    delta_bias_left += self.reg_term * self.bias[0]
    self.weights[0] -= self.learn_r * delta_weight_left
    self.bias[0] -= self.learn_r * delta_bias_left
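A minimal sketch of what I suspect is going on, using made-up activations and targets (not taken from the real network): for a sigmoid output with cross-entropy cost, the chain rule gives delta = dC/da * sigma'(z) = ((a - y) / (a * (1 - a))) * a * (1 - a) = a - y, so the sigma' factor cancels rather than divides. Dividing by sigmoid_derivative, as the cross-entropy branch above does, blows up whenever an output unit is saturated on the wrong side:

```python
import numpy as np

def sigmoid_derivative(a):
    # Same convention as the question: the argument is the activation
    # a = sigmoid(z), so sigma'(z) = a * (1 - a)
    return a * (1.0 - a)

# Hypothetical saturated outputs with the opposite targets
a = np.array([0.001, 0.999, 0.5])
y = np.array([1.0,   0.0,   1.0])

# The questioned line: dividing by sigma' explodes near a -> 0 or a -> 1
delta_divided = (a - y) / sigmoid_derivative(a)

# With the sigma' factor cancelled, the error signal stays bounded
delta_cancelled = a - y

print(delta_divided)    # magnitudes around 1000 for the saturated units
print(delta_cancelled)  # [-0.999  0.999 -0.5]
```

If that diagnosis is right, the fix would be to make the cross-entropy branch compute `a - y` directly instead of dividing by `sigmoid_derivative`.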