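I have implemented a script that performs constrained optimization to solve for the optimal parameters of a Support Vector Machine (SVM) model. I noticed that, for some reason, my script gives inaccurate results (although they are very close to the true values). A typical case is that a calculation should give exactly 0 but instead gives something like
-1/18014398509481984 = -5.551115123125783e-17
This happens when I multiply matrices with vectors. What is also strange is that if I do the multiplication by hand in the MATLAB command window, I get exactly 0.
Let me give an example. If I take the vector Aq = [-1 -1 1 1] and x = [12/65 28/65 32/65 8/65]' and multiply them in the command window, I get 0, as the screenshot below shows:
[screenshot]
If, on the other hand, I do this in my function script, I do not get 0 but the value -1/18014398509481984.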
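Here is the part of my script that is responsible for this multiplication (I have added Aq and x to the script to show the contents of Aq and x as well):
disp('DOT PRODUCT OF ACTIVE SET AND NEW POINT: ')
Aq
x
Aq*x
Here is the result of the code above when run:
[screenshot]
As you can see, the value is not 0 even though it really should be. Note that this problem does not occur for all possible values of Aq and x. If Aq = [-1 -1 1 1] and x = [4/13 4/13 4/13 4/13], the result is exactly 0, as shown below:
[screenshot]
What is causing this inaccuracy? How can I fix it?
P.S. I did not include my whole code because it is not well documented and is a few hundred lines long, but I will post it on request.
Thank you!
Update: new test using Ander Biguri's suggestion:
[screenshot]
Update 2: the code
Note: if you run Example 1 from the help text of the function below, the output should start as follows:
[screenshot]
function [weights, alphas, iters] = solveSVM(data, labels, C, e)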
% FUNCTION [weights, alphas, iters] = solveSVM(data, labels, C, e)
%
% AUTHOR: jjepsuomi
%
% VERSION: 1.0
%
% DESCRIPTION:
% - This function will attempt to solve the optimal weights for a Support
% Vector Machines (SVM) model using active set method with gradient
% projection.
%
% INPUTS:
% "data" a n-by-m data matrix. The number of rows 'n' corresponds to the
% number of data points and the number of columns 'm' corresponds to the
% number of variables.
% "labels" a 1-by-n row vector of data labels from the set {-1,1}.
% "C" Box constraint upper limit. This will constrain the values of 'alphas'
% to the range 0 <= alphas <= C. If hard-margin SVM model is required set
% C=Inf.
% "e" a real value corresponding to the convergence criterion, that is if
% solution Xi and Xi-1 are within distance 'e' from each other stop the
% learning process, i.e. IF |F(Xi)-F(Xi-1)| < e ==> stop learning process.
%
% OUTPUTS:
% "weights" a vector corresponding to the optimal decision line parameters.
% "alphas" a vector of alpha-values corresponding to the optimal solution
% of the dual optimization problem of SVM.
% "iters" number of iterations until learning stopped.
%
% EXAMPLE USAGE 1:
%
% 'Hard-margin SVM':
%
% data = [0 0;2 2;2 0;3 0];
% labels = [-1 -1 1 1];
% [weights, alphas, iters] = solveSVM(data, labels, Inf, 10^-100)
%
% EXAMPLE USAGE 2:
%
% 'Soft-margin SVM':
%
% data = [0 0;2 2;2 0;3 0];
% labels = [-1 -1 1 1];
% [weights, alphas, iters] = solveSVM(data, labels, 0.8, 10^-100)
% STEP 1: INITIALIZATION OF THE PROBLEM
format long
% Calculate linear kernel matrix
L = kron(labels', labels);
K = data*data';
% Hessian matrix
Qd = L.*K;
% The minimization function
L = @(a) (1/2)*a'*Qd*a - ones(1, length(a))*a;
% Gradient of the minimizable function
gL = @(a) a'*Qd - ones(1, length(a));
% STEP 2: THE LEARNING PROCESS, ACTIVE SET WITH GRADIENT PROJECTION
% Initial feasible solution (required by gradient projection)
x = zeros(length(labels), 1);
iters = 1;
optfound = 0;
while optfound == 0 % loop until the stopping criterion is met
    % Negative of the gradient at the current solution
    g = -gL(x);
    % Set the active set and projection matrix
    Aq = labels; % In plane y^Tx = 0
    P = eye(length(x))-Aq'*inv(Aq*Aq')*Aq; % In plane projection
    % Values smaller than 'eps' are changed into 0
    P(find(abs(P-0) < eps)) = 0;
    d = P*g'; % Projection onto plane
    if ~isempty(find(x==0 | x==C)) % Constraints active?
        acinds = find(x==0 | x==C);
        for i = 1:length(acinds)
            if (x(acinds(i)) == 0 && d(acinds(i)) < 0) || x(acinds(i)) == C && d(acinds(i)) > 0
                % Make the constraint vector
                constr = zeros(1,length(x));
                constr(acinds(i)) = 1;
                Aq = [Aq; constr];
            end
        end
        % Update the projection matrix
        P = eye(length(x))-Aq'*inv(Aq*Aq')*Aq; % In plane / box projection
        % Values smaller than 'eps' are changed into 0
        P(find(abs(P-0) < eps)) = 0;
        d = P*g'; % Projection onto plane / border
    end
    %%%% DISPLAY INFORMATION, THIS PART IS NOT NECESSARY, ONLY FOR DEBUGGING
    if Aq*x ~= 0
        disp('ACTIVE SET CONSTRAINTS Aq :')
        Aq
        disp('CURRENT SOLUTION x :')
        x
        disp('MULTIPLICATION OF Aq and x')
        Aq*x
    end
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Values smaller than 'eps' are changed into 0
    d(find(abs(d-0) < eps)) = 0;
    if ~isempty(find(d~=0)) && rank(P) < length(x) % Line search for optimal lambda
        lopt = ((g*d)/(d'*Qd*d));
        lmax = inf;
        for i = 1:length(x)
            if d(i) < 0 && -x(i) ~= 0 && -x(i)/d(i) <= lmax
                lmax = -x(i)/d(i);
            elseif d(i) > 0 && (C-x(i))/d(i) <= lmax
                lmax = (C-x(i))/d(i);
            end
        end
        lambda = max(0, min([lopt, lmax]));
        if abs(lambda) < eps
            lambda = 0;
        end
        xo = x;
        x = x + lambda*d;
        iters = iters + 1;
    end
    % Check whether search direction is 0-vector or 'e'-criterion met.
    if isempty(find(d~=0)) || abs(L(x)-L(xo)) < e
        optfound = 1;
    end
end
%%% STEP 3: GET THE WEIGHTS
alphas = x;
w = zeros(1, length(data(1,:)));
for i = 1:size(data,1)
    w = w + labels(i)*alphas(i)*data(i,:);
end
svinds = find(alphas>0);
svind = svinds(1);
b = 1/labels(svind) - w*data(svind, :)';
%%% STEP 4: OPTIMALITY CHECK, KKT conditions. See KKT-conditions for reference.
weights = [b; w'];
datadim = length(data(1,:));
Q = [zeros(1,datadim+1); zeros(datadim, 1), eye(datadim)];
A = [ones(size(data,1), 1), data];
for i = 1:length(labels)
    A(i,:) = A(i,:)*labels(i);
end
LagDuG = Q*weights - A'*alphas;
Ac = A*weights - ones(length(labels),1);
alpA = alphas.*Ac;
LagDuG(any(abs(LagDuG-0) < 10^-14)) = 0;
if ~any(alphas < 0) && all(LagDuG == zeros(datadim+1,1)) && all(abs(Ac) >= 0) && all(abs(alpA) < 10^-6)
    disp('Optimal found, Karush-Kuhn-Tucker conditions satisfied.')
else
    disp('Optimal not found, Karush-Kuhn-Tucker conditions not satisfied.')
end
% VISUALIZATION FOR 2D-CASE
if size(data, 2) == 2
    pinds = find(labels > 0);
    ninds = find(labels < 0);
    plot(data(pinds, 1), data(pinds, 2), 'o', 'MarkerFaceColor', 'red', 'MarkerEdgeColor', 'black')
    hold on
    plot(data(ninds, 1), data(ninds, 2), 'o', 'MarkerFaceColor', 'blue', 'MarkerEdgeColor', 'black')
    Xb = min(data(:,1))-1;
    Xe = max(data(:,1))+1;
    Yb = -(b+w(1)*Xb)/w(2);
    Ye = -(b+w(1)*Xe)/w(2);
    lineh = plot([Xb Xe], [Yb Ye], 'LineWidth', 2);
    supvh = plot(data(find(alphas~=0), 1), data(find(alphas~=0), 2), 'g.');
    legend([lineh, supvh], 'Decision boundary', 'Support vectors');
    hold off
end
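As the output of Example 1 shows, the multiplication between Aq and x does not produce the value 0, even though it should. In this particular example that is not a serious problem, but with more data points containing many decimals the inaccuracy grows, because the calculations are not exact. That is bad, for example, when I search for a new direction vector while moving towards the optimal solution in the gradient projection method: the search direction is then not exactly the correct direction, only close to it. This is why I want exactly correct values. Is this possible?
I wonder whether the decimals in the data points have something to do with the accuracy of my results. See the picture below:
[screenshot]
So the question is: is this caused by the data, or is something wrong in the optimization procedure?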
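Answer 0 (score: 3)
Do you use the format function inside your script? It looks like you used format rat somewhere.
You can always use the MATLAB eps function, which returns the floating-point relative accuracy, that is, the distance from 1.0 to the next larger double (2^-52, about 2.2204e-16). In my MATLAB R2014b, the absolute value of -1/18014398509481984 is smaller than this value:
format long
a = abs(-1/18014398509481984)
b = eps
a < b
This basically means that the result is zero to within working precision: the value is a rounding error of the floating-point arithmetic, not a meaningful nonzero result.
Otherwise, you can use format long inside your script before the calculation.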
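The eps comparison can also be turned into a tolerance check inside the script, in place of an exact test such as Aq*x ~= 0. The following is only a minimal sketch using the Aq and x from the question; the names r, tol, isZero, reported and belowEps, and the way the tolerance is scaled, are illustrative assumptions and not part of the original code.

```matlab
% Sketch: treat a residual as zero when it is within rounding error.
Aq = [-1 -1 1 1];
x  = [12/65; 28/65; 32/65; 8/65];

r = Aq*x;                          % may be 0 or a few ulps away from 0

% Assumed tolerance: a few units in the last place of the largest entry.
tol = numel(x)*eps(max(abs(x)));

isZero = abs(r) <= tol;            % use this instead of r == 0

% The value reported in the question is itself smaller than eps.
reported = -1/18014398509481984;   % equals -2^-54
belowEps = abs(reported) < eps;    % eps equals 2^-52

fprintf('r = %g, tol = %g, isZero = %d, belowEps = %d\n', r, tol, isZero, belowEps);
```

A tolerance that scales with the magnitude of the operands is usually a safer choice than a fixed constant, because the spacing between doubles grows with their size.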
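Edit
I see the inv function in your code. Try replacing it with the \ operator (mldivide). Its result will be more accurate, since it uses Gaussian elimination without forming the inverse.
The inv documentation states:
"In practice, it is seldom necessary to form the explicit inverse of a matrix. A frequent misuse of inv arises when solving the system of linear equations Ax = b. One way to solve this is with x = inv(A)*b. A better way, from both an execution time and numerical accuracy standpoint, is to use the matrix division operator x = A\b. This produces the solution using Gaussian elimination, without forming the inverse."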
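Applied to the projection matrix in the question, the replacement could look like the sketch below. It assumes an active set made of the equality constraint plus one bound constraint; that second row of Aq and the names P_inv, P_div, res_inv and res_div are made up for illustration. Both variants compute the same projection up to rounding, and the printed residuals show how far each is from annihilating the constraint normals.

```matlab
% Sketch: projection matrix with and without an explicit inverse.
Aq = [-1 -1 1 1; 1 0 0 0];         % equality constraint plus one active bound (illustrative)
n  = size(Aq, 2);

P_inv = eye(n) - Aq'*inv(Aq*Aq')*Aq;     % form used in the question
P_div = eye(n) - Aq'*((Aq*Aq')\Aq);      % same projection via mldivide

% A projection onto the null space of Aq should map Aq' to zero.
res_inv = norm(P_inv*Aq', 'fro');
res_div = norm(P_div*Aq', 'fro');

fprintf('residual with inv: %g, with backslash: %g\n', res_inv, res_div);
```

Neither variant is guaranteed to return exact zeros, so a tolerance-based comparison is still needed afterwards.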
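Answer 1 (score: 2)
With the code provided, this is how I tested it.
I added a breakpoint at the following code:
if Aq*x ~= 0
    disp('ACTIVE SET CONSTRAINTS Aq :')
    Aq
    disp('CURRENT SOLUTION x :')
    x
    disp('MULTIPLICATION OF Aq and x')
    Aq*x
end
When the if branch was taken, I typed the following at the console:
K>> format rat; disp(x);
12/65
28/65
32/65
8/65
K>> disp(x == [12/65; 28/65; 32/65; 8/65]);
0
1
0
0
K>> format('long'); disp(max(abs(x - [12/65; 28/65; 32/65; 8/65])));
1.387778780781446e-17
K>> disp(eps(8/65));
1.387778780781446e-17
This suggests that it is a display issue: format rat deliberately uses small integers to express the value, at the expense of precision. Apparently, the true value of x(4) is not 8/65 itself but the next value that can be represented in double format.
So this raises the question: are you sure that numerical convergence depends on flipping the least significant bit of a double-precision value?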
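The display effect can be reproduced without the full script. The sketch below builds a hypothetical neighbour b of 8/65 that is one unit in the last place larger, and uses the two-output form of rat to obtain a small-integer approximation of each value. It assumes that format rat displays an approximation of the same kind as rat returns; the variable names are illustrative.

```matlab
% Sketch: two different doubles with the same small-integer approximation.
a = 8/65;
b = a + eps(a);                    % hypothetical neighbour, one ulp above a

[na, da] = rat(a);                 % numerator and denominator, default tolerance
[nb, db] = rat(b);

sameFraction = (na == nb) && (da == db);
sameDouble   = (a == b);

fprintf('%d/%d vs %d/%d, same fraction: %d, same double: %d\n', ...
        na, da, nb, db, sameFraction, sameDouble);
fprintf('difference: %.15e\n', b - a);
```

Comparing with == or printing the difference in format long, as in the console session, is therefore a more reliable check than reading the rational display.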