I am trying to implement a 3-layer neural network with feedforward and backpropagation. I have tested my cost function and it works fine, and my gradient function also seems fine on its own. But when I try to optimize the parameters using fmin_cg from scipy, I get this warning:
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 4.643489
Iterations: 1
Function evaluations: 123
Gradient evaluations: 110
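For context, the optimizer call looks roughly like this (a minimal sketch: cost_function and initial_theta are placeholder names, and the args tuple is inferred from the parameters my gradient code uses):

from scipy.optimize import fmin_cg

# Sketch of the call that triggers the warning. cost_function and
# initial_theta are placeholders; gradient is the function shown below.
theta_opt = fmin_cg(cost_function, initial_theta, fprime=gradient,
                    args=(input_layer_size, hidden_layer_size,
                          num_labels, x, y, lambda_),
                    maxiter=50)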
I searched for this warning and was told that the problem is the gradient. Here is my gradient code:
import numpy as np

def gradient(theta_flatten, input_layer_size, hidden_layer_size, num_labels, x, y, lambda_):
    theta_flatten = theta_flatten.reshape(1,-1)
    # retrieve theta values from flattened theta
    theta_hidden = theta_flatten[0,0:((input_layer_size+1)*hidden_layer_size)]
    theta_hidden = theta_hidden.reshape((input_layer_size+1),hidden_layer_size)
    theta_output = theta_flatten[0,((input_layer_size+1)*hidden_layer_size):]
    theta_output = theta_output.reshape(hidden_layer_size+1,num_labels)
    # start of section 1: feedforward
    a1 = x # 5000x401
    z2 = np.dot(a1,theta_hidden) # 5000x25
    a2 = sigmoid(z2)
    a2 = np.append(np.ones(shape=(a1.shape[0],1)),a2,axis=1) # 5000x26, adding column of 1's to a2
    z3 = np.dot(a2,theta_output) # 5000x10
    a3 = sigmoid(z3) # a3 = h(x) w.r.t. theta
    a3 = rotate_column(a3) # mapping 0 to "0" instead of 0 to "10"
    # end of section 1
    # start of section 2: output-layer error
    delta3 = a3 - y # 5000x10
    # end of section 2
    # start of section 3: hidden-layer error
    delta2 = (np.dot(delta3,theta_output.transpose()))[:,1:] # 5000x25, drop delta2(0)
    delta2 = delta2*sigmoid_gradient(z2)
    # end of section 3
    # start of section 4: accumulate the gradients
    DELTA2 = np.dot(a2.transpose(),delta3) # 26x10
    DELTA1 = np.dot(a1.transpose(),delta2) # 401x25
    # end of section 4
    # start of section 5: average and regularize
    theta_hidden_ = np.append(np.ones(shape=(theta_hidden.shape[0],1)),theta_hidden[:,1:],axis=1) # regularization
    theta_output_ = np.append(np.ones(shape=(theta_output.shape[0],1)),theta_output[:,1:],axis=1) # regularization
    D1 = DELTA1/a1.shape[0] + (theta_hidden_*lambda_)/a1.shape[0]
    D2 = DELTA2/a1.shape[0] + (theta_output_*lambda_)/a1.shape[0]
    # end of section 5
    Dvec = np.append(D1,D2)
    return Dvec
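Since the warning points at the gradient, one standard way to verify it is a finite-difference check against the cost (a minimal sketch; cost_function is again a placeholder for my cost implementation, which takes the flattened theta as its first argument):

import numpy as np

def check_gradient(cost_function, gradient, theta, args, eps=1e-4, n_checks=10):
    # Compare the analytic gradient against a central-difference
    # estimate on a few randomly chosen components of theta.
    analytic = gradient(theta, *args)
    rng = np.random.default_rng(0)
    for i in rng.choice(theta.size, size=n_checks, replace=False):
        theta_plus, theta_minus = theta.copy(), theta.copy()
        theta_plus[i] += eps
        theta_minus[i] -= eps
        numeric = (cost_function(theta_plus, *args)
                   - cost_function(theta_minus, *args)) / (2 * eps)
        print(i, numeric, analytic[i])

If the two values disagree noticeably for some components, the analytic gradient (rather than fmin_cg itself) is the likely culprit.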
I looked at other people's implementations on GitHub, but nothing helped; they implement it the same way I do.
Some comments (sketches of the helper functions these sections rely on appear after this list):
Section 1: the feedforward pass
Sections 2 through 4: backpropagation from the output layer back to the input layer
Section 5: assembling the gradients
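For completeness, here are sketches of those helpers: sigmoid and sigmoid_gradient are the standard logistic function and its derivative, and rotate_column (going by the comment in section 1) moves the last output column, label 10 i.e. digit 0, to the front:

import numpy as np

def sigmoid(z):
    # logistic activation: 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_gradient(z):
    # derivative of the logistic function: sigmoid(z) * (1 - sigmoid(z))
    s = sigmoid(z)
    return s * (1.0 - s)

def rotate_column(a):
    # move the last column to the front so that digit 0 sits in column 0
    return np.roll(a, 1, axis=1)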
Please help.
Thanks.