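I am trying to solve the following optimization problem in Octave:
The first constraint is that A is positive semi-definite. S is a set of pairs of data points such that if (xi, xj) is in S, then xi is similar to xj; D is a set of pairs such that if (xi, xj) is in D, then xi and xj are dissimilar. Note that the formula above consists of two separate sums; the second sum is not nested inside the first. Assume xi and xj are column vectors of length N.
Since this is a nonlinear optimization, I am trying to use Octave's nonlinear program solver, sqp. The problem is that if I supply only the objective function, the BFGS method fails to find the Hessian on some small toy tests. So I tried supplying my own Hessian function, but now I get this error: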
error: __qp__: operator *: nonconformant arguments (op1 is 2x2, op2 is 3x1)
error: called from:
error: /usr/share/octave/3.6.3/m/optimization/qp.m at line 393, column 26
error: /usr/share/octave/3.6.3/m/optimization/sqp.m at line 414, column 32
when I make the following call to sqp:
[A, ~, Info] = sqp(initial_guess, {@toOpt, @CalculateGradient,@CalculateHessian},
[],[],0,[],maxiter);
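I simplified the constraint that A be positive semi-definite by restricting A to be diagonal: I solve only for the diagonal entries and constrain all of them to be >= 0. initial_guess is a vector of length N.
Here is my code to compute what I believe is the Hessian matrix: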
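For reference, here is a minimal self-contained sketch of the same calling pattern on a toy problem. It is not the metric-learning objective: the target vector c, the quadratic objective and the names f, g and H are all hypothetical and exist only for illustration. It passes the objective, gradient and Hessian to sqp as a three-element cell array, and it writes the lower bound as a vector of N zeros rather than a scalar, on the assumption that this is what the 0 in the call above is meant to express.

```octave
% Toy problem (hypothetical): minimize sum((x - c).^2) subject to x >= 0.
c = [1; -2; 3];
N = numel(c);

f = @(x) sum((x - c).^2);     % objective, returns a scalar
g = @(x) 2 * (x - c);         % gradient, N x 1
H = @(x) 2 * eye(numel(x));   % Hessian, must be N x N

x0 = ones(N, 1);              % initial guess, N x 1
lb = zeros(N, 1);             % one lower bound per variable
maxiter = 200;

[x, obj, info] = sqp(x0, {f, g, H}, [], [], lb, [], maxiter);
disp(x');
```

The point of the sketch is the shapes: for an N-vector x, the gradient is N x 1 and the Hessian is N x N, so any matrix of a different size that reaches the QP subproblem would produce a nonconformant-arguments error.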
%Hessian = CalculateHessian(A)
%calculates the Hessian of the function we are optimizing as follows
%H(i,j) = (sumsq(D(:,i),1) * sumsq(D(:,j),1)) / (sum(A.*sumsq(D,1))^2)
%where D is a matrix of differences between observations that are dissimilar, with one difference on each row
%and sumsq is the sum of the squares
%input A: the current guess for A
%output Hessian: The hessian of the function we are optimizing
function Hessian = CalculateHessian(A)
global HessianNumerator; %this is a matrix with the numerator of H(i,j)
global Dsum_of_squares; %the sum of the squares of the differences of each dimensions of the dissimilar observations
if(iscolumn(A)) %if A is a column vector
A = A'; %make it a row vector. necessary to prevent broadcasting
endif
if(~isempty(Dsum_of_squares)) %if dissimilar constraints were provided
Hessian = HessianNumerator / (sum(A.*Dsum_of_squares)^2); %semicolon suppresses printing on every call
else
Hessian = HessianNumerator; %the hessian is a matrix of 0s
endif
endfunction
and Dsum_of_squares and HessianNumerator are computed as:
[dissimilarRow,dissimilarColumn] = find(D); %find which observations are dissimilar to each other
DissimilarDiffs = X(dissimilarRow,:) - X(dissimilarColumn,:); %take the difference between the dissimilar observations
Dsum_of_squares = sumsq(DissimilarDiffs,1);
HessianNumerator = Dsum_of_squares' * Dsum_of_squares; %N x N outer product, the numerator of the Hessian. it is a constant value
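X is an M×N matrix with one observation per row.
D is an M×M dissimilarity matrix. If D(i,j) is 1, then row i of X is dissimilar to row j; otherwise it is 0.
I think my mistake is in one of the following areas (from least likely to most likely):
Any help is greatly appreciated. If you need me to post more code, I am happy to do so. Right now the code that tries to solve the optimization is about 160 lines.
Below is the test case I am running that makes the code fail. If I pass only the gradient function, it works.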
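The Hessian formula in the code comment can be checked with a short derivation. This is a sketch under an assumption: it supposes that the dissimilarity term of the objective has the form -log(a^T d), where a is the vector of diagonal entries of A and d is Dsum_of_squares, and that the similarity term is linear in a. That form is inferred from the comment on CalculateHessian; the objective itself is not reproduced here.

```latex
% Assumed notation: a \in \mathbb{R}^N is the diagonal of A,
% d_k = \sum_{(x_i, x_j) \in D} (x_{ik} - x_{jk})^2  (Dsum_of_squares).
\begin{align*}
  \frac{\partial}{\partial a_i}\bigl[-\log(a^\top d)\bigr]
    &= -\frac{d_i}{a^\top d}, \\
  \frac{\partial^2}{\partial a_i\,\partial a_j}\bigl[-\log(a^\top d)\bigr]
    &= \frac{d_i\, d_j}{(a^\top d)^2}, \\
  H &= \frac{d\, d^\top}{(a^\top d)^2} \in \mathbb{R}^{N \times N}.
\end{align*}
% A term linear in a contributes nothing to H, which matches the
% all-zero Hessian returned when no dissimilar pairs are given.
```

Under that assumption H is a rank-one N×N matrix, which agrees with H(i,j) as written in the comment.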
X = [1 2 3;
4 5 6;
7 8 9;
10 11 12];
S = [0 1 1 0;
1 0 0 0;
1 0 0 0;
0 0 0 0]; %this means row 1 of X is similar to rows 2 and 3
D = [0 0 0 0;
0 0 0 0;
0 0 0 1;
0 0 1 0]; %this means row 3 of X is dissimilar to row 4
gml(X,S,D, 200); %200 is the maximum number of iterations for sqp to run
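As a sanity check on dimensions, the sketch below recomputes the Hessian pieces for this test case outside of gml. It is self-contained and uses hypothetical local names (r, col, diffs, d, Hnum, a, Hess) in place of the globals. The all-ones vector a is an arbitrary stand-in for the current guess, not a value taken from the real run.

```octave
% Test data from above: 4 observations, N = 3 dimensions.
X = [1 2 3; 4 5 6; 7 8 9; 10 11 12];
D = [0 0 0 0; 0 0 0 0; 0 0 0 1; 0 0 1 0];

[r, col] = find(D);              % index pairs of dissimilar observations
diffs = X(r, :) - X(col, :);     % one difference per row, here 2 x 3
d = sumsq(diffs, 1);             % per-dimension sum of squares, 1 x 3
Hnum = d' * d;                   % outer product, 3 x 3

a = ones(1, columns(X));         % arbitrary stand-in for the current guess
Hess = Hnum / (sum(a .* d)^2);   % same formula as CalculateHessian

printf("diffs is %dx%d, Hess is %dx%d\n", size(diffs), size(Hess));
```

For this data the Hessian built this way is 3×3, matching the length of initial_guess. The 2×2 operand in the error message would therefore have to come from somewhere other than these two lines, although this check alone does not say where.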