I am trying to train my data through a neural network using cross-entropy and softmax. There are 2 hidden layers with 100 neurons each and an output layer with 10 neurons. I am having trouble updating the weights during backpropagation: the code fails because I am trying to add 100 neurons to 10 neurons. The logic of my code must be wrong, but I can't figure out where. self.weights_hidden_output has 10 values, and target_vector, output_vector2 and output_network also have 10 values, while tmp3, output_vector1, output_hidden and gradient have 100.
def train_single(self, input_vector, target_vector):
    input_vector = np.array(input_vector, ndmin=2).T
    target_vector = np.array(target_vector, ndmin=2).T
    output_vector1 = np.dot(self.weights_in_hidden, input_vector)
    output_hidden = Activation.reLU(output_vector1)
    output_vector2 = np.dot(self.weights_hidden_output, output_hidden)
    output_network = Activation.reLU(output_vector2)
    loss = Cross_Entropy.calc(output_network, target_vector)
    gradient = Cross_Entropy.derived_calc(output_hidden, target_vector)
    tmp1 = loss * gradient
    # update the weights:
    derived1 = Derivative.reLU(gradient)
    tmp2 = derived1 * tmp1
    tmp3 = self.learning_rate * np.dot(tmp2, output_hidden.T)
    # TODO - fix this bug, exception caused by dimensions
    self.weights_hidden_output += tmp3
    # calculate hidden errors:
    hidden_errors = np.dot(self.weights_hidden_output.T, loss)
    # update the weights:
    tmp = hidden_errors * Derivative.reLU(output_hidden)
    self.weights_in_hidden += self.learning_rate * np.dot(tmp, input_vector.T)
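
For reference, here is a minimal standalone sketch of one training step with a softmax output and cross-entropy loss (the combination named above). It only shows how the shapes are usually made to line up: the output delta is computed from the 10-value output_network rather than the 100-value output_hidden, the hidden error is propagated through weights_hidden_output.T before that matrix is updated, and gradient descent subtracts the gradient. The plain-numpy helpers and the example sizes are assumptions for illustration, not the actual Activation / Cross_Entropy / Derivative classes:

import numpy as np

def softmax(z):
    # shift by the max for numerical stability, then normalize
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

def train_single(weights_in_hidden, weights_hidden_output,
                 input_vector, target_vector, learning_rate=0.1):
    x = np.array(input_vector, ndmin=2).T      # (n_in, 1)
    t = np.array(target_vector, ndmin=2).T     # (10, 1), one-hot

    # forward pass
    z1 = np.dot(weights_in_hidden, x)          # (100, 1)
    h = np.maximum(z1, 0)                      # ReLU, (100, 1)
    z2 = np.dot(weights_hidden_output, h)      # (10, 1)
    y = softmax(z2)                            # (10, 1)

    # backward pass: for softmax + cross-entropy the output delta
    # collapses to (y - t); it is not multiplied by the loss value
    delta_out = y - t                          # (10, 1)
    grad_out = np.dot(delta_out, h.T)          # (10, 100), same shape as weights_hidden_output

    # propagate the error with the old weights, before updating them
    delta_hidden = np.dot(weights_hidden_output.T, delta_out) * (z1 > 0)  # (100, 1)
    grad_in = np.dot(delta_hidden, x.T)        # (100, n_in)

    # gradient descent steps downhill, so subtract the gradients
    weights_hidden_output -= learning_rate * grad_out
    weights_in_hidden -= learning_rate * grad_in

Called as train_single(W1, W2, x, t) with W1 shaped (100, n_in) and W2 shaped (10, 100), every product above is conformable, so the dimension exception at the += line does not occur.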