Octave code for gradient descent using vectorization not updating the cost function correctly

Asked: 2014-10-30 15:09:32

Tags: machine-learning octave vectorization gradient-descent

I have implemented the following gradient descent code using vectorization, but it seems the cost function is not decreasing correctly. Instead, the cost function increases with each iteration.

Assume theta is an (n+1)-vector, y is an m-vector, and X is the m × (n+1) design matrix.
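For reference, the quantities being implemented here are the standard linear-regression cost and its vectorized batch gradient descent update:

$$J(\theta) = \frac{1}{2m}\lVert X\theta - y\rVert^2, \qquad \theta := \theta - \frac{\alpha}{m} X^{\top}(X\theta - y)$$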

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)

m = length(y); % number of training examples
n = length(theta); % number of features
J_history = zeros(num_iters, 1);
error = ((theta' * X')' - y)*(alpha/m);
descent = zeros(size(theta),1);
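% note: `error` is computed once, up here, so it never reflects the
% updated theta, and `descent` is never reset to zero, so the steps
% accumulate across iterations -- see the sketch after this function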

for iter = 1:num_iters
    for i = 1:n
        descent(i) = descent(i) + sum(error .* X(:,i));
        i = i + 1;
    end

    theta = theta - descent;
    J_history(iter) = computeCost(X, y, theta);
    disp("the value of cost function is : "), disp(J_history(iter));
    iter = iter + 1;
end
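Those two issues are the likely cause of the increasing cost. A minimal sketch of the loop with them fixed, otherwise keeping the asker's structure (note that (theta' * X')' is just X * theta):

for iter = 1:num_iters
    error = (X * theta - y) * (alpha/m);  % recompute with the current theta
    descent = zeros(n, 1);                % reset the step each iteration
    for i = 1:n
        descent(i) = sum(error .* X(:,i));
    end
    theta = theta - descent;
    J_history(iter) = computeCost(X, y, theta);
end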

The computeCost function is:

function J = computeCost(X, y, theta)
m = length(y);
J = 0;
for i = 1:m,
   H = theta' * X(i,:)';
   E = H - y(i);
   SQE = E^2;
   J = (J + SQE);
   i = i+1;
end;
J = J / (2*m);

3 Answers:

Answer 0 (score: 3)

You can vectorize it further:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); 
    J_history = zeros(num_iters, 1);

    for iter = 1:num_iters

       delta = (theta' * X' - y') * X;
       theta = theta - (alpha/m) * delta';
       J_history(iter) = computeCost(X, y, theta);

    end

end
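A quick sanity check with a made-up toy dataset (the numbers below are illustrative, not from the question); with a small enough alpha, J_history should now decrease monotonically:

% fit y = 2*x using an intercept column in X
X = [ones(5,1), (1:5)'];
y = [2; 4; 6; 8; 10];
theta = zeros(2, 1);
[theta, J_history] = gradientDescent(X, y, theta, 0.05, 500);
plot(J_history);  % should fall toward zero, with theta near [0; 2]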

Answer 1 (score: 1)

You can vectorize it more cleanly as follows:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1);

  for iter = 1:num_iters

     theta = theta - (alpha/m) * ((X*theta - y)' * X)';
     J_history(iter) = computeCost(X, y, theta);

  end;
end;
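Since (A*B)' = B'*A', the same update can be written without the double transpose:

theta = theta - (alpha/m) * X' * (X*theta - y);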

The computeCost function can be written as:

function J = computeCost(X, y, theta)
  m = length(y); 

  J = 1/(2*m)*sum((X*theta-y)^2);

end;

Answer 2 (score: 0)

The computeCost in the previous answer applies the matrix power operator ^ to the column vector X*theta - y, which Octave rejects for non-square arguments; the element-wise .^ is what is needed:

function J = computeCost(X, y, theta)
  m = length(y); 

  J = 1/(2*m)*sum((X*theta-y).^2);

end;
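Equivalently, the sum of element-wise squares is an inner product, which sidesteps the .^ vs ^ distinction entirely:

J = 1/(2*m) * (X*theta - y)' * (X*theta - y);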