Question

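Edit: the code I pasted is too long. Basically, I don't know how to use the second piece of code. If I knew how to compute alpha with it, I think my problem would be solved. I have tried many different input arguments for the second code, but it doesn't work!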

I wrote the following code to solve a convex optimization problem with the gradient descent method:

function [optimumX,optimumF,counter,gNorm,dx] = grad_descent()

x0 = [3 3]';  % initial point as a 2-element column vector
terminationThreshold = 1e-6;
maxIterations = 100;
dxMin = 1e-6;

gNorm = inf; x = x0; counter = 0; dx = inf;

% ************************************
f = @(x1,x2) 4.*x1.^2 + 2.*x1.*x2 +8.*x2.^2 + 10.*x1 + x2;
%alpha = 0.01;
% ************************************

figure(1); clf; ezcontour(f,[-5 5 -5 5]); axis equal; hold on

f2 = @(x) f(x(1),x(2));

% gradient descent algorithm:
while and(gNorm >= terminationThreshold, and(counter <= maxIterations, dx >= dxMin))
    g = grad(x);
    gNorm = norm(g);

    alpha = linesearch_strongwolfe(f,-g, x0, 1);

    xNew = x - alpha * g;
    % check step
    if ~isfinite(xNew)
        display(['Number of iterations: ' num2str(counter)])
        error('x is inf or NaN')
    end
    % **************************************
    plot([x(1) xNew(1)],[x(2) xNew(2)],'ko-')
    refresh
    % **************************************

    counter = counter + 1;
    dx = norm(xNew-x);
    x = xNew;
end
optimumX = x;
optimumF = f2(optimumX);
counter = counter - 1;

% define the gradient of the objective
function g = grad(x)
g = [(8*x(1) + 2*x(2) +10)
    (2*x(1) + 16*x(2) + 1)];
end

end

As you can see, I have commented out the alpha = 0.01; line. I want to compute alpha with another piece of code instead. Here it is (this code is not mine):

function alphas = linesearch_strongwolfe(f,d,x0,alpham)

alpha0 = 0;
alphap = alpha0;
c1 = 1e-4;
c2 = 0.5;
alphax = alpham*rand(1);
[fx0,gx0] = feval(f,x0,d);
fxp = fx0;
gxp = gx0;
i=1;

while (1 ~= 2)
  xx = x0 + alphax*d;
  [fxx,gxx] = feval(f,xx,d);
  if (fxx > fx0 + c1*alphax*gx0) | ((i > 1) & (fxx >= fxp)),
    alphas = zoom(f,x0,d,alphap,alphax);
    return;
  end
  if abs(gxx) <= -c2*gx0,
    alphas = alphax;
    return;
  end
  if gxx >= 0,
    alphas = zoom(f,x0,d,alphax,alphap);
    return;
  end
  alphap = alphax;
  fxp = fxx;
  gxp = gxx;
  alphax = alphax + (alpham-alphax)*rand(1);
  i = i+1;
end

function alphas = zoom(f,x0,d,alphal,alphah)
c1 = 1e-4;
c2 = 0.5;
[fx0,gx0] = feval(f,x0,d);

while (1~=2),
   alphax = 1/2*(alphal+alphah);
   xx = x0 + alphax*d;
   [fxx,gxx] = feval(f,xx,d);
   xl = x0 + alphal*d;
   fxl = feval(f,xl,d);
   if ((fxx > fx0 + c1*alphax*gx0) | (fxx >= fxl)),
      alphah = alphax;
   else
      if abs(gxx) <= -c2*gx0,
        alphas = alphax;
        return;
      end
      if gxx*(alphah-alphal) >= 0,
        alphah = alphal;
      end
      alphal = alphax;
   end
end

But I get this error:

Error in linesearch_strongwolfe (line 11)
[fx0,gx0] = feval(f,x0,d);

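As you can see, I wrote the function f and its gradient by hand. linesearch_strongwolfe(f,d,x0,alpham) takes a function f, the gradient of f, a vector x0 and a constant alpham. Is there something wrong with my declaration of f? If I put alpha = 0.01; back, this code works fine.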

Answer 1

I think the problem is here:

x0 = [3; 3]; %2-element column vector
g = grad(x0); %2-element column vector
f = @(x1,x2) 4.*x1.^2 + 2.*x1.*x2 +8.*x2.^2 + 10.*x1 + x2;
linesearch_strongwolfe(f,-g, x0, 1); %passing variables

Inside the function:

[fx0,gx0] = feval(f,x0,-g); %variable names substituted with input vars

which effectively calls

[fx0,gx0] = f(x0,-g);

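But with these inputs, f(x0,-g) evaluates to a single 2-element column vector, and an anonymous function like this returns only that one output. Requesting two outputs from it will not work.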
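A minimal sketch of that failure, using only f, x0 and the gradient formula from the question. The names y and twoOutputsFailed are mine, introduced for illustration. It evaluates f with one output first, then requests two outputs inside try/catch so the script keeps running:

```matlab
% Objective exactly as declared in the question (element-wise in x1 and x2).
f  = @(x1,x2) 4.*x1.^2 + 2.*x1.*x2 + 8.*x2.^2 + 10.*x1 + x2;

x0 = [3; 3];                                 % 2-element column vector
g  = [8*x0(1) + 2*x0(2) + 10;                % gradient formula from grad()
      2*x0(1) + 16*x0(2) + 1];

% One output is fine: the result is a single 2-by-1 vector.
y = f(x0, -g);
disp(size(y))

% Two outputs are not: the anonymous function only produces one value.
try
    [fx0, gx0] = f(x0, -g);
    twoOutputsFailed = false;
catch err
    twoOutputsFailed = true;
    disp(err.message)                        % message text depends on the MATLAB release
end
```

The same thing happens through feval, since feval(f,x0,d) just forwards the call to the handle.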

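You must either define f as a proper named function (like grad) that returns 2 output variables (one for each component), or edit the code of linesearch_strongwolfe so that it receives a single variable and splits it into 2 separate variables itself.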
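One thing worth checking before rewriting f: the tests inside linesearch_strongwolfe (fxx > fx0 + c1*alphax*gx0 and abs(gxx) <= -c2*gx0) have the shape of the strong Wolfe conditions. That suggests, and this is my reading of undocumented code rather than something its author states, that the two outputs may be meant as the scalar function value at x and the scalar slope along d. The sketch below checks those two conditions directly for the objective in the question. The names phi, dphi, isStrongWolfe, H and alphaExact are hypothetical and not part of either posted function:

```matlab
% Objective and gradient from the question, written for a 2-element vector x.
f2   = @(x) 4*x(1)^2 + 2*x(1)*x(2) + 8*x(2)^2 + 10*x(1) + x(2);
grad = @(x) [8*x(1) + 2*x(2) + 10; 2*x(1) + 16*x(2) + 1];
H    = [8 2; 2 16];                 % constant Hessian of this quadratic

x = [3; 3];
g = grad(x);
d = -g;                             % steepest-descent direction

% Restriction of f to the line x + a*d, and its slope (assumed meaning of gx0/gxx).
phi  = @(a) f2(x + a*d);
dphi = @(a) grad(x + a*d)' * d;

c1 = 1e-4;                          % same constants as in linesearch_strongwolfe
c2 = 0.5;

% Strong Wolfe test: sufficient decrease and curvature condition.
isStrongWolfe = @(a) (phi(a) <= phi(0) + c1*a*dphi(0)) && ...
                     (abs(dphi(a)) <= -c2*dphi(0));

alphaExact = (g'*g) / (g'*H*g);     % exact minimizer along d for a quadratic
okExact = isStrongWolfe(alphaExact);
okFixed = isStrongWolfe(0.01);      % the fixed step the question used before
disp([alphaExact, okExact, okFixed])
```

With these numbers the exact step (about 0.066) passes both tests, while the fixed alpha = 0.01 gives sufficient decrease but fails the curvature test, because the slope along d is still steep at that point.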

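If you are afflicted by a very rare kind of laziness and don't want to define a named function, you can still use an anonymous function, at the cost of duplicating the code for the two components (at least I couldn't come up with a cleaner solution):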

f = @(x1,x2) deal(4.*x1(1)^2 + 2.*x1(1)*x2(1) +8.*x2(1)^2 + 10.*x1(1) + x2(1),...
                  4.*x1(2)^2 + 2.*x1(2)*x2(2) +8.*x2(2)^2 + 10.*x1(2) + x2(2));

[fx0,gx0] = f(x0,-g); %now works fine

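This works as long as you always request exactly 2 output variables. Note that this is more of a proof of concept, since it is ugly, inefficient, and very prone to typos.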
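The restriction to exactly two outputs matters here, because the zoom subfunction in the question calls fxl = feval(f,xl,d) with a single output. A small sketch of how a deal-based handle behaves in both cases. The handle h and its inputs are hypothetical, chosen only to keep the arithmetic obvious:

```matlab
% A deal-based anonymous function that hands back two values.
h = @(a, b) deal(a + b, a - b);

% Requesting two outputs matches the two inputs given to deal.
[s, t] = h(5, 2);
disp([s, t])

% Requesting one output does not: deal needs as many outputs as inputs.
try
    only = h(5, 2);
    oneOutputWorks = true;
catch err
    oneOutputWorks = false;
    disp(err.message)               % message text depends on the MATLAB release
end
```

So a handle built this way would get past the first feval call in linesearch_strongwolfe but would stop at the single-output call in zoom. A named function with two declared outputs does not have that problem.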
