I am using the following code to run logistic regression with stochastic gradient descent in MATLAB. The total number of training + testing samples is about 600K. The code runs for several hours. How can I speed it up?
%% load dataset
clc;
clear;
load('covtype.mat');
Data = [X y];
%% Split into testing and training data in a 1:9 split
nRows = size(Data,1);
randRows = randperm(nRows);  % random ordering of row indices
nTest = round(nRows/10);     % 1:9 test:train split (58101 rows for covtype)
Test = Data(randRows(1:nTest),:);       % index using the random order
Train = Data(randRows(nTest+1:end),:);
Testx = Test(:,1:54);    % 54 feature columns
Testy = Test(:,55:end);  % label column
Trainx = Train(:,1:54);
Trainy = Train(:,55:end);
%% Perform stochastic gradient descent on training data
lambda=0.01; % regularisation constant
alpha=0.01; % step length constant
theta_old = zeros(54,1);
theta_new = theta_old;
nTrain = size(Trainx,1);
count_dummy = zeros(nTrain,1);  % preallocated: iteration numbers for plotting
llr_dummy = zeros(nTrain,1);    % preallocated: test log-likelihood per iteration
for count = 1:nTrain
    theta_old = theta_new;
    % SGD update for L2-regularised logistic regression
    theta_new = theta_old + (alpha*Trainy(count) * ...
        (1.0 ./ (1.0 + exp(Trainy(count)*(Trainx(count,:)*theta_old)))) .* Trainx(count,:))' ...
        - alpha*lambda*2*theta_old;
    n = norm(theta_new);
    llr = lambda*n*n;
    % log-likelihood error on the full test set with the current theta_new
    for i = 1:size(Testx,1)
        llr = llr - log(1.0 + exp(-(Testy(i)*(Testx(i,:)*theta_new))));
    end
    count_dummy(count) = count;
    llr_dummy(count) = llr;
end
thetaopt = theta_new; % this is optimal theta
%% Plot results on testing data
plot(count_dummy, llr_dummy);
I have to compute the log-likelihood error on the test data at every iteration so that I can plot it afterwards. How can I make this code faster?
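For context, most of the runtime goes into the inner loop, which touches all ~58K test rows at every one of the ~523K SGD steps. As a minimal sketch (reusing the variable names from the script above), that per-row loop can be collapsed into a single matrix-vector product computing the same quantity:

```matlab
% Vectorised replacement for the inner i-loop: one matrix-vector
% product over the whole test set instead of ~58K scalar iterations.
margins = Testy .* (Testx * theta_new);   % one margin y_i * x_i' * theta per test row
llr = lambda*(theta_new.'*theta_new) - sum(log(1.0 + exp(-margins)));
```

This lets MATLAB's optimised BLAS routines do the work. If that is still too slow, evaluating the test log-likelihood only every, say, 100th iteration and plotting those points would cut the cost by another two orders of magnitude, at the price of a coarser curve.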