Octave error when providing a Hessian function to sqp

Asked: 2013-03-03 04:42:32

Tags: matlab octave mathematical-optimization hessian-matrix

I am trying to solve the following optimization problem in Octave:

Formula

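The first constraint is that A is positive semi-definite. S is a set of pairs of data points such that if (xi, xj) is in S, then xi is similar to xj; D is a set of pairs such that if (xi, xj) is in D, then xi and xj are dissimilar. Note that the formula above consists of two separate sums; the second sum is not nested inside the first. Assume xi and xj are column vectors of length N.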

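Because this is a nonlinear optimization, I am trying to use Octave's nonlinear program solver, sqp. The problem is that if I supply only the function to optimize, the BFGS method fails to find the Hessian on some small toy tests. So I tried supplying my own Hessian function, but now I get this error: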

error: __qp__: operator *: nonconformant arguments (op1 is 2x2, op2 is 3x1)
error: called from:
error:   /usr/share/octave/3.6.3/m/optimization/qp.m at line 393, column 26
error:   /usr/share/octave/3.6.3/m/optimization/sqp.m at line 414, column 32

when I make the following call to sqp:

[A, ~, Info] = sqp(initial_guess, {@toOpt, @CalculateGradient,@CalculateHessian},
[],[],0,[],maxiter);
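I simplified the constraint that A be positive semi-definite by also requiring A to be diagonal: I solve only for the diagonal entries and constrain all of them to be >= 0. initial_guess is a vector of length N.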
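For reference, here is a minimal sketch of the shapes involved when objective, gradient and Hessian handles are passed to sqp in a cell array. It uses a made-up quadratic objective; the names `c`, `f`, `g`, `h`, `x0`, `lb` and `ub` are illustrative only and are not part of the code in this question. For an N-by-1 starting point, the gradient is N-by-1 and the Hessian is N-by-N.

```octave
% Toy objective (illustrative only): f(x) = sum(c .* x.^2) - sum(x), with x >= 0
c  = [1; 2; 3];
x0 = ones(3, 1);                   % N-by-1 starting point, N = 3

f = @(x) sum(c .* x.^2) - sum(x);  % scalar objective
g = @(x) 2 * c .* x - 1;           % gradient, N-by-1
h = @(x) diag(2 * c);              % Hessian, N-by-N

% Check the shapes before handing the functions to sqp
n = numel(x0);
assert(isequal(size(g(x0)), [n, 1]));
assert(isequal(size(h(x0)), [n, n]));

lb = zeros(n, 1);                  % lower bounds: x >= 0
ub = Inf(n, 1);                    % no upper bounds
[x, obj, info] = sqp(x0, {f, g, h}, [], [], lb, ub, 200);
```

The same two size assertions can be pointed at @CalculateGradient and @CalculateHessian with initial_guess, to compare against the 2x2 and 3x1 operands reported in the error message.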

Here is my code to calculate what I believe is the Hessian matrix:

%Hessian = CalculateHessian(A)
%calculates the Hessian of the function we are optimizing as follows
%H(i,j) = (sumsq(D(:,i),1) * sumsq(D(:,j),1)) / (sum(A.*sumsq(D,1))^2)
%where D is a matrix of differences between observations that are dissimilar, with one difference on each row
%and sumsq is the sum of the squares
%input A: the current guess for A
%output Hessian: The hessian of the function we are optimizing
function Hessian = CalculateHessian(A)
    global HessianNumerator; %this is a matrix with the numerator of H(i,j)
    global Dsum_of_squares; %the sum of the squares of the differences of each dimensions of the dissimilar observations

    if(iscolumn(A)) %if A is a column vector
        A = A'; %make it a row vector. necessary to prevent broadcasting
    endif

    if(~isempty(Dsum_of_squares)) %if dissimilar constraints were provided
        Hessian = HessianNumerator / (sum(A.*Dsum_of_squares)^2); %semicolon keeps the matrix from printing on every call
    else
        Hessian = HessianNumerator; %the hessian is a matrix of 0s
    endif

endfunction
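As a check on the formula in the comment, this is the derivation it would follow from, assuming the dissimilarity term of the objective has the form $-\log\left(\sum_k A_k s_k\right)$, where $s_k$ are the entries of Dsum_of_squares and $A_k$ are the diagonal entries being solved for. That form is an assumption inferred from the comment, not taken from the formula itself. Any term that is linear in A contributes nothing to the Hessian.

```latex
f(A) = -\log\Big(\sum_{k=1}^{N} A_k s_k\Big), \qquad
\frac{\partial f}{\partial A_i} = -\frac{s_i}{\sum_{k} A_k s_k}, \qquad
H_{ij} = \frac{\partial^2 f}{\partial A_i \, \partial A_j}
       = \frac{s_i \, s_j}{\big(\sum_{k} A_k s_k\big)^2}
```

Under that assumption, H is the N-by-N outer product of s with itself, divided by $\left(\sum_k A_k s_k\right)^2$, which agrees with the expression in the comment.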

and Dsum_of_squares and HessianNumerator are computed as follows:

[dissimilarRow,dissimilarColumn] = find(D); %find which observations are dissimilar to each other
DissimilarDiffs = X(dissimilarRow,:) - X(dissimilarColumn,:); %take the difference between the dissimilar observations
Dsum_of_squares = sumsq(DissimilarDiffs,1);
HessianNumerator = Dsum_of_squares .* Dsum_of_squares'; %calculate the numerator of the Hessian. it is a constant value

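X is an M-by-N matrix with one observation per row.

D is an M-by-M dissimilarity matrix. If D(i,j) is 1, then row i of X is dissimilar to row j; otherwise D(i,j) is 0.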

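I think my error is in one of the following areas (from least likely to most likely):

  1. The math I used to derive the Hessian function is wrong. The formula I am using is in the comment on my function.
  2. My implementation of the math.
  3. The Hessian matrix that sqp wants is different from the one described on the Hessian matrix Wikipedia page.

Any help is greatly appreciated. If you need me to post more code, I would be happy to do so. Right now, the code that tries to solve the optimization is about 160 lines.

Below is the test case I am running that causes the code to fail. It works if I pass only the gradient function.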

    X = [1 2 3; 
         4 5 6; 
         7 8 9; 
         10 11 12];
    S = [0 1 1 0; 
         1 0 0 0; 
         1 0 0 0; 
         0 0 0 0]; %this means row 1 of X is similar to rows 2 and 3
    D = [0 0 0 0; 
         0 0 0 0;
         0 0 0 1;
         0 0 1 0]; %this means row 3 of X is dissimilar to row 4
    gml(X,S,D, 200); %200 is the maximum number of iterations for sqp to run
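To see which piece has an unexpected shape, the setup code can be run on the test data by itself. This is only a sketch: it swaps the broadcasting product for an explicit outer product (a substitution, mathematically the same N-by-N matrix) and prints the sizes so they can be compared with the 2x2 operand in the error message.

```octave
% Test data from the question
X = [1 2 3; 4 5 6; 7 8 9; 10 11 12];
D = [0 0 0 0; 0 0 0 0; 0 0 0 1; 0 0 1 0];

[dissimilarRow, dissimilarColumn] = find(D);
DissimilarDiffs = X(dissimilarRow, :) - X(dissimilarColumn, :);  % one pair per row
Dsum_of_squares = sumsq(DissimilarDiffs, 1);                     % 1-by-N row vector

% Explicit outer product instead of relying on broadcasting
HessianNumerator = Dsum_of_squares' * Dsum_of_squares;           % N-by-N

printf("DissimilarDiffs:  %d-by-%d\n", size(DissimilarDiffs));
printf("Dsum_of_squares:  %d-by-%d\n", size(Dsum_of_squares));
printf("HessianNumerator: %d-by-%d\n", size(HessianNumerator));
```

If the sizes printed here are consistent with a length-N initial guess, the mismatch more likely comes from how the globals are populated inside gml, which is not shown in the question.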
    

0 Answers:

No answers