I implemented the gradient descent code below using vectorization, but the cost function does not seem to decrease correctly. Instead, it increases with every iteration.
Assume theta is an (n+1)-vector, y is an m-vector, and X is the m × (n+1) design matrix.
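For reference, the standard linear-regression cost and batch gradient-descent update that this code is meant to implement are:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( \theta^\top x^{(i)} - y^{(i)} \right)^2,
\qquad
\theta \leftarrow \theta - \frac{\alpha}{m} \, X^\top (X\theta - y)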
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y); % number of training examples
n = length(theta); % number of features
J_history = zeros(num_iters, 1);
error = ((theta' * X')' - y) * (alpha/m);
descent = zeros(size(theta), 1);
for iter = 1:num_iters
    for i = 1:n
        descent(i) = descent(i) + sum(error .* X(:,i));
        i = i + 1;
    end
    theta = theta - descent;
    J_history(iter) = computeCost(X, y, theta);
    disp("the value of cost function is : "), disp(J_history(iter));
    iter = iter + 1;
end
The computeCost function is:
function J = computeCost(X, y, theta)
m = length(y);
J = 0;
for i = 1:m,
    H = theta' * X(i,:)';
    E = H - y(i);
    SQE = E^2;
    J = J + SQE;
    i = i + 1;
end;
J = J / (2*m);
Answer 0 (score: 3)
The cost increases because error is computed only once, before the loop, from the initial theta, and descent is never reset, so the same initial gradient keeps accumulating and the step grows with every iteration. A minimal in-loop fix is sketched below; after it, you can vectorize the update completely.
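A sketch of the minimal fix, keeping your loop structure (it assumes the same m, n, and J_history setup as in your function; err replaces error to avoid shadowing the MATLAB built-in):

for iter = 1:num_iters
    err = (X * theta - y) * (alpha/m);   % recompute from the CURRENT theta each pass
    descent = zeros(n, 1);               % reset each pass instead of accumulating
    for i = 1:n
        descent(i) = sum(err .* X(:,i)); % gradient step component for parameter i
    end
    theta = theta - descent;
    J_history(iter) = computeCost(X, y, theta);
end

And fully vectorized: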
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    delta = (theta' * X' - y') * X;
    theta = theta - alpha/m * delta';
    J_history(iter) = computeCost(X, y, theta);
end
end
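A quick smoke test on made-up data (the feature range, noise level, alpha, and iteration count below are purely illustrative, and computeCost is assumed to be on the path):

% Fit y = 3 + 2x to noisy toy data and watch the cost fall.
m = 100;
x = rand(m, 1);                     % single feature in [0, 1]
y = 3 + 2 * x + 0.1 * randn(m, 1);  % noisy line
X = [ones(m, 1) x];                 % design matrix with intercept column
theta = zeros(2, 1);
[theta, J_history] = gradientDescent(X, y, theta, 0.1, 2000);
disp(theta);                        % should end up close to [3; 2]
plot(J_history);                    % should slope steadily downward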
Answer 1 (score: 1)
You can vectorize it more compactly, as follows:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    theta = theta - (alpha/m) * ((X*theta - y)' * X)';
    J_history(iter) = computeCost(X, y, theta);
end
end
The computeCost function can be written as
function J = computeCost(X, y, theta)
m = length(y);
J = 1/(2*m) * sum((X*theta - y).^2);  % element-wise .^, not the matrix power ^
end
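A sanity check with made-up numbers: with an exact fit the cost is zero.

X = [1 1; 1 2];  y = [2; 3];
computeCost(X, y, [1; 1])   % returns 0, since X*theta equals y exactly
computeCost(X, y, [0; 0])   % returns (4 + 9)/(2*2) = 3.25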
Answer 2 (score: 0)
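Make sure the power in computeCost is the element-wise .^; the matrix power ^ applied to the column vector X*theta - y is an error: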
function J = computeCost(X, y, theta)
m = length(y);
J = 1/(2*m) * sum((X*theta - y).^2);
end
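As an independent cross-check on any of these gradient-descent variants, the closed-form least-squares solution gives the value theta should converge to when X is well conditioned and alpha is small enough:

theta_exact = pinv(X' * X) * X' * y;  % normal equation, for comparison with the converged theta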