Error: out of memory or dimension too large for Octave's index type

Time: 2017-06-21 14:01:21

Tags: matlab octave

I am training a neural network on a digit recognition problem, but when I run the optimizer function I get the following error:

    error: out of memory or dimension too large for Octave's index type
    error: called from
        fminunc at line 214 column 12
        optimization at line 3 column 18

For more detail, see the code below:

1) Optimization function:

    function [theta_final,cost,accuracy,digit]=optimization(X,y,initial_theta,lambda)
      options=optimset('GradObj','on','MaxIter',20);
      [theta_final,cost]=fminunc(@(theta)CostCompute(X,y,theta,lambda),initial_theta,options);

2) Cost computation:

    function [cost,gradient_vector]=CostCompute(X,y,theta,lambda)
      input_layer_size=400;
      hidden_layer_size=25;
      num_labels=size(unique(y));  # note: size() returns a [rows cols] vector here, not a scalar
      # Unroll theta into the two weight matrices.
      theta1=reshape(theta(1:(hidden_layer_size*(input_layer_size+1))),hidden_layer_size,input_layer_size+1);  # 10025, 97.4
      theta2=reshape(theta(1+(hidden_layer_size*(input_layer_size+1)):end),num_labels,hidden_layer_size+1);  # 260, 2.
      m=size(X,1);
      # Forward pass; a bias row of ones is prepended to the hidden layer.
      hidden_layer=sigmoid(theta1*X');
      hidden_layer=[ones(1,size(hidden_layer,2));hidden_layer];
      output_layer=sigmoid(theta2*hidden_layer);
      # One-hot encode the labels, one row per training example.
      output=[];
      for i=1:m
        temp=zeros(1,num_labels);
        temp(y(i))=1;
        output=[output;temp];
      endfor
      reg=sum(theta.*theta);
      # Cross-entropy cost; the regularization term is currently commented out.
      cost=-(1/m)*sum(sum(output.*log(output_layer')+((1-output).*log(1-output_layer'))));  # +reg*lambda/(2*m);
      [gradient_vector]=backprop(X,y,theta1,theta2,num_labels,m,lambda,output);
    endfunction
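
As a side note on the label handling in `CostCompute`: `size(unique(y))` returns a two-element size vector rather than a scalar count, and the loop grows `output` one row at a time. The sketch below shows a loop-free alternative. It assumes the labels in `y` are the integers 1..K (the post does not show `y`), and the sample data is made up for illustration.

```octave
# Hypothetical labels for illustration; the post does not show y.
y = [3; 1; 2; 3; 1];

num_labels_vec = size(unique(y));   # size vector [rows cols] of the unique labels
num_labels = numel(unique(y));      # scalar count of distinct labels

# One-hot encode without a loop: row y(i) of the identity matrix
# has a single 1 in column y(i).
I = eye(num_labels);
output = I(y, :);                   # m x num_labels

disp(num_labels_vec);
disp(num_labels);
disp(output);
```

Indexing the identity matrix this way gives the same rows the loop would produce, without reallocating `output` on every iteration.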

3) Backpropagation:

    function [gradient_vector]=backprop(X,y,theta1,theta2,num_labels,m,lambda,output)
      temp_y=zeros(1,num_labels);
      # Forward pass (repeated from CostCompute).
      hidden_layer=sigmoid(theta1*X');
      hidden_layer=[ones(1,size(hidden_layer,2));hidden_layer];
      output_layer=sigmoid(theta2*hidden_layer);
      # Output-layer error and gradient for theta2.
      delta_output_layer=output_layer-output';
      gradient2=delta_output_layer*hidden_layer';  # 10x25 changes made in hidden_layer
      gradient2=gradient2*(1/m);
      # Propagate the error back to the hidden layer and drop the bias row.
      delta2=(theta2')*delta_output_layer;
      delta2=delta2.*sigmoid_gradient(hidden_layer);
      delta2=delta2(2:end,:);
      gradient1=delta2*X;
      gradient1=gradient1*(1/m);
      # Unroll both gradients into a single column vector.
      gradient_vector=[gradient1(:);gradient2(:)];
    endfunction
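
The code above calls `sigmoid` and `sigmoid_gradient`, which the post does not show. For readers who want to run it, here is a minimal sketch using the conventional definitions. These are assumptions, not the asker's actual helpers. In particular, the post does not say whether `sigmoid_gradient` expects pre-activation values or already-activated values.

```octave
# Assumed definitions; the post does not include these helpers.
sigmoid = @(z) 1 ./ (1 + exp(-z));

# Conventional form: derivative of the sigmoid evaluated at pre-activation z.
sigmoid_gradient = @(z) sigmoid(z) .* (1 - sigmoid(z));

z = [-1 0 1];
disp(sigmoid(z));
disp(sigmoid_gradient(z));
```

Both helpers work element-wise, so they accept the full layer matrices used in `CostCompute` and `backprop`.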

None of the matrices is large enough to exceed available memory or the limits of Octave's index type.
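
The sizes involved can be tallied directly. The sketch below uses the layer sizes from `CostCompute`. It assumes 10 labels (suggested by the `# 260` and `# 10x25` comments, but not stated outright) and a hypothetical 5000 training examples. It also prints how large a dense n-by-n matrix over the unrolled parameter vector would be. Whether `fminunc` allocates such a matrix internally is an assumption that the post does not confirm.

```octave
input_layer_size = 400;
hidden_layer_size = 25;
num_labels = 10;   # assumption, inferred from the comments in the post
m = 5000;          # hypothetical number of training examples

n_theta1 = hidden_layer_size * (input_layer_size + 1);  # elements in theta1
n_theta2 = num_labels * (hidden_layer_size + 1);        # elements in theta2
n = n_theta1 + n_theta2;                                # length of unrolled theta

bytes_per_double = 8;
X_mib = m * (input_layer_size + 1) * bytes_per_double / 2^20;
dense_mib = n^2 * bytes_per_double / 2^20;  # dense n-by-n matrix of doubles

printf("theta1: %d, theta2: %d, total: %d\n", n_theta1, n_theta2, n);
printf("X: %.1f MiB, dense n-by-n matrix: %.1f MiB\n", X_mib, dense_mib);
```

Under these assumptions the data and weight matrices are small, while anything quadratic in the length of `theta` is far larger.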

Any help is greatly appreciated. Thanks!

0 answers:

No answers yet