Question

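So I wrote the following MATLAB code as a gradient descent exercise. I deliberately chose a function whose minimum is at (0,0), but the algorithm throws me to (-3,3).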

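This happens even though xGrad and yGrad, obtained from [xGrad,yGrad] = gradient(f);, come out roughly as expected (xGrad is approximately 2*X). I think I have flipped something here, but I have been trying to work out what for a while and cannot find it, so I am hoping someone will spot my mistake...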

Thanks to anyone who helps.

Edit: fixed typos and made the code clearer. It still does the same thing and has the same problem.

Answer 1

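In the X matrix returned by meshgrid, the x-values increase across the columns (that is, with the column index), not down the rows! For example, [X, Y] = meshgrid(-1:1, 1:3) returns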

     [-1  0  1;           [1  1  1;
X  =  -1  0  1;       Y =  2  2  2;
      -1  0  1];           3  3  3];

Note that the x-index belongs in the column position of X or Y, and the y-index belongs in the row position. Specifically, your line:

fGrad = [ xGrad(idx,idy) , yGrad(idx,idy) ]; %gradient's definition

should be changed to:

fGrad = [ xGrad(idy,idx) , yGrad(idy,idx) ]; %gradient's definition

The idy variable should index the rows, and the idx variable should index the columns.
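A quick way to check this convention is to build a small grid and compare the two indexing orders directly. The sketch below is only an illustration, not the asker's original code: the test function X.^2 + Y.^2, the axis vector v, and the particular values of idx and idy are assumptions chosen for the demonstration.

```matlab
% Illustrative check of the meshgrid indexing convention (all values assumed).
v = -3:0.5:3;                        % same axis for x and y
[X, Y] = meshgrid(v, v);             % X varies along columns, Y along rows
f = X.^2 + Y.^2;                     % simple bowl with its minimum at (0,0)
[xGrad, yGrad] = gradient(f, v, v);  % numerical df/dx and df/dy on the grid

idx = 9;                             % column index, so x = v(idx) = 1
idy = 3;                             % row index,    so y = v(idy) = -2

% Correct order: rows are indexed by idy, columns by idx.
pointOk = [X(idy, idx), Y(idy, idx)];           % the point (1, -2)
gradOk  = [xGrad(idy, idx), yGrad(idy, idx)];   % gradient at (1, -2)

% Swapped order: reads the entries that belong to the point (-2, 1) instead.
gradSwapped = [xGrad(idx, idy), yGrad(idx, idy)];

disp(pointOk); disp(gradOk); disp(gradSwapped);
```

For this assumed test function, the swapped lookup returns the gradient of the mirrored point, so the x-step ends up driven by the y-coordinate and vice versa, which is one way a descent can drift away from (0,0) instead of toward it.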

Answer 2

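In the end I did not figure out what was wrong with the previous approach, but here is an alternative gradient descent script that I used for the same problem: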

syms x y
f = -20*(x/2-x^2-y^5)*exp(-x^2-y^2); %cost function
% f = x^2+y^2; %simple test function

g = gradient(f, [x, y]);
lr = .01; %learning rate
tol = 1e-10; %convergence threshold on the squared step length
tooMuch = 1e3; %iterations' limit
p = [1.5 -1]; %starting point
for i=1:tooMuch %prevents too many iterations
    pGrad = [subs(g(1),[x y],p(end,:)) subs(g(2),[x y],p(end,:))]; %computes gradient
    pTMP = p(end,:) - lr*pGrad; %gradient descent's core
    p = [p;double(pTMP)]; %adds the new point
    if sum( (p(end,:)-p(end-1,:)).^2 ) < tol %checks convergence
        break
    end
end
v = -3:.1:3; %desired axes
[X, Y] = meshgrid(v,v);
contour(v,v,double(subs(f,[x y],{X,Y}))) %draws the contour lines (converted to numeric)
hold on
quiver(v,v,double(subs(g(1), [x y], {X,Y})),double(subs(g(2), [x y], {X,Y}))) %draws the gradient directions
plot(p(:,1),p(:,2)) %draws the route
hold off
%suptitle ships with the Bioinformatics Toolbox; sgtitle is the built-in equivalent from R2018b on
suptitle(['gradient descent route from ',mat2str(round(p(1,:),3)),' with \eta=',num2str(lr)])
if i<tooMuch
    title(['converged to ',mat2str(round(p(end,:),3)),' after ',mat2str(i),' steps'])
else
    title(['stopped at ',mat2str(round(p(end,:),3)),' without converging'])
end

Just some results:

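In the latter case you can see that it did not converge, but that is not a problem with gradient descent itself; the learning rate was simply set too high (so it repeatedly overshoots the minimum).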
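To illustrate the effect of the learning rate, consider the simple test function x^2 + y^2 from the script, whose gradient is [2x, 2y]. Each step then multiplies the current point by (1 - 2*lr), so the iteration shrinks toward (0,0) only when |1 - 2*lr| < 1. The sketch below is a hypothetical numeric check: the two learning rates and the names gradF, rates and finalNorm are chosen purely for illustration and are not the settings behind the results mentioned above.

```matlab
% Hypothetical check: gradient descent on x^2 + y^2 with two learning rates.
gradF = @(p) 2 * p;            % analytic gradient of x^2 + y^2
p0 = [1.5 -1];                 % starting point (same as in the script)
rates = [0.1 1.1];             % assumed values: one small, one too large
finalNorm = zeros(size(rates));

for k = 1:numel(rates)
    p = p0;
    for i = 1:50
        p = p - rates(k) * gradF(p);   % each step scales p by (1 - 2*lr)
    end
    finalNorm(k) = norm(p);            % distance from the minimum at (0,0)
end

disp(finalNorm)   % first entry is close to zero, second has grown very large
```

With the smaller rate the distance to the minimum shrinks by a factor of 0.8 per step; with the larger rate every step jumps past the minimum and lands farther away than it started.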

Feel free to use it.
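The loop in the script calls subs on every iteration, which can be slow. One possible alternative, sketched here on the assumption that the Symbolic Math Toolbox is available (the script already needs it), is to convert the symbolic gradient into a numeric function handle once with matlabFunction. The names gFun, tol and maxIter are introduced only for this illustration, and the simple test function from the script's commented-out line is used.

```matlab
% Sketch: same descent loop, with the gradient turned into a numeric handle.
syms x y
f = x^2 + y^2;                             % simple test function from the script
g = gradient(f, [x, y]);                   % symbolic 2x1 gradient
gFun = matlabFunction(g, 'Vars', {x, y});  % numeric handle: gFun(x, y) -> 2x1

lr = 0.01;        % learning rate
tol = 1e-10;      % convergence threshold on the squared step length
maxIter = 1e3;    % iteration limit
p = [1.5 -1];     % starting point; visited points are appended as rows
for i = 1:maxIter
    pNew = p(end,:) - lr * gFun(p(end,1), p(end,2)).';  % descent step
    p = [p; pNew];                                       %#ok<AGROW>
    if sum((p(end,:) - p(end-1,:)).^2) < tol             % convergence check
        break
    end
end
```

Because p stays numeric throughout, the plotting part of the script can use p unchanged.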

Gradient descent MATLAB script
