I am training a neural network on this digit recognition problem, but when I run my optimization function I get the following error:
>error: out of memory or dimension too large for Octave's index type
>error: called from
fminunc at line 214 column 12
optimization at line 3 column 18
For more detail, see the code below:
1) Optimization function:
> function [theta_final, cost, accuracy, digit] = optimization(X, y, initial_theta, lambda)
# Use the analytic gradient returned by CostCompute; stop after 20 iterations
options = optimset('GradObj', 'on', 'MaxIter', 20);
[theta_final, cost] = fminunc(@(theta) CostCompute(X, y, theta, lambda), initial_theta, options);
2) Cost computation:
> function [cost, gradient_vector] = CostCompute(X, y, theta, lambda)
input_layer_size = 400;
hidden_layer_size = 25;
num_labels = size(unique(y));
# Unroll theta into the two weight matrices
theta1 = reshape(theta(1:(hidden_layer_size*(input_layer_size+1))), hidden_layer_size, input_layer_size+1); #10025, 97.4
theta2 = reshape(theta(1+(hidden_layer_size*(input_layer_size+1)):end), num_labels, hidden_layer_size+1); #260, 2.
m = size(X, 1);
# Forward pass: hidden layer, prepend the bias row, then output layer
hidden_layer = sigmoid(theta1*X');
hidden_layer = [ones(1, size(hidden_layer, 2)); hidden_layer];
output_layer = sigmoid(theta2*hidden_layer);
# Build the one-hot label matrix, one row per training example
output = [];
for i = 1:m
  temp = zeros(1, num_labels);
  temp(y(i)) = 1;
  output = [output; temp];
endfor
reg = sum(theta.*theta);
# Cross-entropy cost; the regularization term is currently commented out
cost = -(1/m)*sum(sum(output.*log(output_layer') + ((1-output).*log(1-output_layer')))); #+reg*lambda/(2*m);
[gradient_vector] = backprop(X, y, theta1, theta2, num_labels, m, lambda, output);
endfunction
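A side note on the label handling in CostCompute: for a column vector `y`, `size(unique(y))` returns a two-element vector (for example `[10 1]`), not a scalar count, whereas `numel(unique(y))` returns the scalar. The sketch below uses a toy label vector and assumes the labels run from 1 to the number of classes. It shows the difference and a loop-free way to build the one-hot `output` matrix. It is only an illustration and not a confirmed cause of the error above.

```octave
# Toy labels (assumption: a column vector with classes numbered 1..K)
y = [1; 2; 3; 2; 1];

sz = size(unique(y));           # two-element size vector: rows and columns
num_labels = numel(unique(y));  # scalar number of distinct labels

# One-hot encoding without a loop: pick rows of the identity matrix
I = eye(num_labels);
output = I(y, :);               # m-by-num_labels, a single 1 in each row
```

Here `output` has one row per example, the same layout the `for` loop in CostCompute produces, without growing the matrix on every iteration.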
3) Backpropagation:
> function [gradient_vector] = backprop(X, y, theta1, theta2, num_labels, m, lambda, output)
temp_y = zeros(1, num_labels);
# Forward pass (repeats the computation done in CostCompute)
hidden_layer = sigmoid(theta1*X');
hidden_layer = [ones(1, size(hidden_layer, 2)); hidden_layer];
output_layer = sigmoid(theta2*hidden_layer);
# Output-layer error and gradient for theta2
delta_output_layer = output_layer - output';
gradient2 = delta_output_layer*hidden_layer'; #10x25 changes made in hidden_layer
gradient2 = gradient2*(1/m);
# Propagate the error back to the hidden layer and drop the bias row
delta2 = (theta2')*delta_output_layer;
delta2 = delta2.*sigmoid_gradient(hidden_layer);
delta2 = delta2(2:end, :);
gradient1 = delta2*X;
gradient1 = gradient1*(1/m);
# Unroll both gradients into a single column vector
gradient_vector = [gradient1(:); gradient2(:)];
endfunction
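As far as I can tell, none of these matrices is large enough to exceed the available memory or the limits of Octave's index type.
Any help is greatly appreciated. Thanks!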
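One way to sanity-check that claim is some quick arithmetic on the sizes involved. The sketch below takes the layer sizes from CostCompute and assumes ten digit classes. It computes the length `n` of the unrolled parameter vector and the storage needed for a single dense n-by-n matrix of doubles. Whether `fminunc` allocates a matrix of that shape internally (for example for a Hessian approximation) is an assumption here; the error message does not confirm it.

```octave
input_layer_size = 400;
hidden_layer_size = 25;
num_labels = 10;   # assumption: ten digit classes

# Length of the unrolled parameter vector (theta1 plus theta2)
n = hidden_layer_size*(input_layer_size + 1) + num_labels*(hidden_layer_size + 1);

# Storage for one dense n-by-n matrix of 8-byte doubles
bytes_dense = n^2 * 8;

printf("n = %d, dense n-by-n matrix needs about %.2f GB\n", n, bytes_dense/1e9);
```

With these sizes `n` is 10285, so the weight matrices themselves are small, but a dense n-by-n matrix would need roughly 0.85 GB. That can be too much for a 32-bit Octave build or a machine with little free memory.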