I'm trying to port some of my code from MATLAB to Python and I'm running into a problem with the scipy.optimize.fmin_cg function. This is the code I have so far:
My cost function:
def nn_costfunction2(nn_params,*args):
    Theta1, Theta2 = reshapeTheta(nn_params)
    input_layer_size, hidden_layer_size, num_labels, X, y, lam = args[0], args[1], args[2], args[3], args[4], args[5]
    m = X.shape[0] #Length of vector
    X = np.hstack((np.ones([m,1]),X)) #Add in the bias unit
    layer1 = sigmoid(Theta1.dot(np.transpose(X))) #Calculate first layer
    layer1 = np.vstack((np.ones([1,layer1.shape[1]]),layer1)) #Add in bias unit
    layer2 = sigmoid(Theta2.dot(layer1))
    y_matrix = np.zeros([y.shape[0],layer2.shape[0]]) #Create a matrix where vector position of one corresponds to label
    for i in range(y.shape[0]):
        y_matrix[i,y[i]-1] = 1
    #Cost function
    J = (1/m)*np.sum(np.sum(-y_matrix.T.conj()*np.log(layer2),axis=0)-np.sum((1-y_matrix.T.conj())*np.log(1-layer2),axis=0))
    #Add in regularization
    J = J+(lam/(2*m))*np.sum(np.sum(Theta1[:,1:].conj()*Theta1[:,1:])+np.sum(Theta2[:,1:].conj()*Theta2[:,1:]))
    #Backpropagation with vectorization and regularization
    delta_3 = layer2 - y_matrix.T
    r2 = delta_3.T.dot(Theta2[:,1:])
    z_2 = Theta1.dot(X.T)
    delta_2 = r2*sigmoidGradient(z_2).T
    t1 = (lam/m)*Theta1[:,1:]
    t1 = np.hstack((np.zeros([t1.shape[0],1]),t1))
    t2 = (lam/m)*Theta2[:,1:]
    t2 = np.hstack((np.zeros([t2.shape[0],1]),t2))
    Theta1_grad = (1/m)*(delta_2.T.dot(X))+t1
    Theta2_grad = (1/m)*(delta_3.dot(layer1.T))+t2
    nn_params = np.hstack([Theta1_grad.flatten(),Theta2_grad.flatten()]) #Unroll parameters
    return nn_params
My call to the function:
args = (input_layer_size, hidden_layer_size, num_labels, X, y, lam)
fmin_cg(nn_costfunction2,nn_params, args=args,maxiter=50)
This gives the following error:
File "C:\WinPython3\python-3.3.2.amd64\lib\site-packages\scipy\optimize\optimize.py", line 588, in approx_fprime
grad[k] = (f(*((xk+d,)+args)) - f0) / d[k]
ValueError: setting an array element with a sequence.
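I've tried various permutations of passing the arguments to fmin_cg, but this is the furthest I've gotten. Running the cost function on its own, in this form, doesn't raise any errors.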
Answer 0: (score: 1)
The input variable of the cost function should be a 1-D array, so Theta1 and Theta2 must be derived from nn_params. You also need to return J.
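As a rough illustration of the point above, here is a minimal sketch of packing two weight matrices into one 1-D vector and recovering them again. The helper names reshape_theta and unroll, and the layer sizes, are assumptions made for this example; the question's own reshapeTheta is not shown, so this is not necessarily how it works.

```python
import numpy as np

def reshape_theta(nn_params, input_layer_size, hidden_layer_size, num_labels):
    """Split a flat 1-D parameter vector into two weight matrices (hypothetical helper)."""
    # Theta1 has one row per hidden unit and one column per input plus bias.
    split = hidden_layer_size * (input_layer_size + 1)
    theta1 = nn_params[:split].reshape(hidden_layer_size, input_layer_size + 1)
    theta2 = nn_params[split:].reshape(num_labels, hidden_layer_size + 1)
    return theta1, theta2

def unroll(theta1, theta2):
    """Flatten both matrices (row-major) into a single 1-D array."""
    return np.hstack([theta1.ravel(), theta2.ravel()])

# Example sizes chosen only for illustration.
input_layer_size, hidden_layer_size, num_labels = 4, 3, 2
theta1 = np.arange(15, dtype=float).reshape(3, 5)
theta2 = np.arange(8, dtype=float).reshape(2, 4)

flat = unroll(theta1, theta2)  # this 1-D array is what the optimizer sees as x0
t1, t2 = reshape_theta(flat, input_layer_size, hidden_layer_size, num_labels)
```

Because ravel and reshape both use row-major order by default, the round trip returns the original matrices. The cost function would then compute J from t1 and t2 and return it as a single number.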
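Answer 1: (score: 1)
Try adding the epsilon parameter (the finite-difference step size) to the function call, passed as a keyword argument:
fmin_cg(nn_costfunction2, nn_params, args=args, epsilon=epsilon, maxiter=50)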
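For reference, a small self-contained sketch of passing epsilon by keyword. The objective toy_cost and the step size 1e-6 are made up for this example and are not taken from the question.

```python
import numpy as np
from scipy.optimize import fmin_cg

def toy_cost(x):
    # Hypothetical objective: returns a single scalar, not an array.
    return float(np.sum((x - 3.0) ** 2))

x0 = np.zeros(3)
# No fprime is given, so fmin_cg approximates the gradient with finite
# differences, using the step size passed as epsilon.
xopt = fmin_cg(toy_cost, x0, epsilon=1e-6, maxiter=50, disp=False)
```

Note that epsilon only sets the finite-difference step used when fprime is omitted; the objective itself still has to return a scalar.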
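Answer 2: (score: 0)
I think the problem is that you have nnCostFunction2 return both the cost and the gradient.
However, scipy.optimize.fmin_cg expects only a single cost output from nnCostFunction2.
So keep only the single J (cost) output in nnCostFunction2.
Here is my working call: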
scipy.optimize.fmin_cg(nnCostFunction, initial_rand_theta, backpropagate, \
args=(hidden_s, input_s, num_labels, X, y, lamb), maxiter=1000, \
disp=True, full_output=True)
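The pattern in that call, with the cost and the gradient as two separate functions and the gradient passed as fprime, can be sketched on a toy least-squares problem. The functions cost and grad and the data A and b are assumptions for illustration only, not the neural-network code from this thread.

```python
import numpy as np
from scipy.optimize import fmin_cg

def cost(theta, A, b):
    # Scalar objective: 0.5 * ||A theta - b||^2
    r = A.dot(theta) - b
    return 0.5 * r.dot(r)

def grad(theta, A, b):
    # Gradient of the objective, a 1-D array with the same shape as theta.
    return A.T.dot(A.dot(theta) - b)

# Toy data chosen so that theta = [1, 1] solves A theta = b exactly.
A = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([2.0, 1.0, 2.0])
theta0 = np.zeros(2)

# Both functions receive the same extra arguments through args.
xopt, fopt, func_calls, grad_calls, warnflag = fmin_cg(
    cost, theta0, fprime=grad, args=(A, b),
    maxiter=100, disp=False, full_output=True)
```

With full_output=True the call returns the minimizer together with the final cost, the call counts, and a warning flag, which is handy for checking whether the iteration limit was hit.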