Question

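I have implemented a script that performs constrained optimization to solve for the optimal parameters of a Support Vector Machine model. I noticed that, for some reason, my script gives inaccurate results (although very close to the true values). A typical case is that a computation that should be exactly 0 instead comes out as something like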

-1/18014398509481984 = -5.551115123125783e-17

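This happens when I multiply a matrix by a vector. What is also strange is that if I do the multiplication by hand in the MATLAB command window, I get 0 as the result.

Let me give an example: if I take the vectors Aq = [-1 -1 1 1] and x = [12/65 28/65 32/65 8/65]' and do the multiplication in the command window, I get 0, as shown in the picture below:

[image: command window output, result 0]

If, on the other hand, I do this in my function script, I do not get 0 as the result; I get the value -1/18014398509481984 instead.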

Here is the part of my script that is responsible for this multiplication (I have added Aq and x to the script to show the contents of Aq and x as well):

disp('DOT PRODUCT OF ACTIVE SET AND NEW POINT: ')
Aq
x
Aq*x

Here is the result of the code above when run:

[image: script output, result -1/18014398509481984]

As you can see, the value is not 0 even though it really should be. Note that this problem does not occur for all possible values of Aq and x. If Aq = [-1 -1 1 1] and x = [4/13 4/13 4/13 4/13], the result is exactly 0, as you can see below:

[image: script output, result 0]

What is causing this inaccuracy? How can I fix it?

P.S. I did not include my whole code because it is not well documented and is a few hundred lines long, but I will on request.

Thank you!

Update: new test using Ander Biguri's suggestion:

[image]

Update 2: the code

function [weights, alphas, iters] = solveSVM(data, labels, C, e)
% FUNCTION [weights, alphas, iters] = solveSVM(data, labels, C, e)
%
% AUTHOR: jjepsuomi
%
% VERSION: 1.0
%
% DESCRIPTION:
%   - This function will attempt to solve the optimal weights for a Support
%   Vector Machines (SVM) model using active set method with gradient
%   projection.
%
% INPUTS:
%   "data" a n-by-m data matrix. The number of rows 'n' corresponds to the
%   number of data points and the number of columns 'm' corresponds to the
%   number of variables.
%   "labels" a 1-by-n row vector of data labels from the set {-1,1}.
%   "C" Box constraint upper limit. This will constrain the values of 'alphas'
%   to the range 0 <= alphas <= C. If hard-margin SVM model is required set
%   C=Inf.
%   "e" a real value corresponding to the convergence criterion, that is if
%   solution Xi and Xi-1 are within distance 'e' from each other stop the
%   learning process, i.e. IF |F(Xi)-F(Xi-1)| < e ==> stop learning process.
%
% OUTPUTS:
%   "weights" a vector corresponding to the optimal decision line parameters.
%   "alphas" a vector of alpha-values corresponding to the optimal solution
%   of the dual optimization problem of SVM.
%   "iters" number of iterations until learning stopped.
%
% EXAMPLE USAGE 1:
%
%   'Hard-margin SVM':
%
%   data = [0 0;2 2;2 0;3 0];
%   labels = [-1 -1 1 1];
%   [weights, alphas, iters] = solveSVM(data, labels, Inf, 10^-100)
%
% EXAMPLE USAGE 2:
%
%   'Soft-margin SVM':
%
%   data = [0 0;2 2;2 0;3 0];
%   labels = [-1 -1 1 1];
%   [weights, alphas, iters] = solveSVM(data, labels, 0.8, 10^-100)

% STEP 1: INITIALIZATION OF THE PROBLEM

format long

% Calculate linear kernel matrix
L = kron(labels', labels);
K = data*data';
% Hessian matrix
Qd = L.*K;
% The minimization function
L = @(a) (1/2)*a'*Qd*a - ones(1, length(a))*a;
% Gradient of the minimizable function
gL = @(a) a'*Qd - ones(1, length(a));

% STEP 2: THE LEARNING PROCESS, ACTIVE SET WITH GRADIENT PROJECTION

% Initial feasible solution (required by gradient projection)
x = zeros(length(labels), 1);
iters = 1;
optfound = 0;

while optfound == 0 % criterion met

    % Negative of the gradient at initial solution
    g = -gL(x);

    % Set the active set and projection matrix
    Aq = labels; % In plane y^Tx = 0
    P = eye(length(x))-Aq'*inv(Aq*Aq')*Aq; % In plane projection
    % Values smaller than 'eps' are changed into 0
    P(find(abs(P-0) < eps)) = 0;
    d = P*g'; % Projection onto plane

    if ~isempty(find(x==0 | x==C)) % Constraints active?
        acinds = find(x==0 | x==C);
        for i = 1:length(acinds)
            if (x(acinds(i)) == 0 && d(acinds(i)) < 0) || x(acinds(i)) == C && d(acinds(i)) > 0
                % Make the constraint vector
                constr = zeros(1,length(x));
                constr(acinds(i)) = 1;
                Aq = [Aq; constr];
            end
        end
        % Update the projection matrix
        P = eye(length(x))-Aq'*inv(Aq*Aq')*Aq; % In plane / box projection
        % Values smaller than 'eps' are changed into 0
        P(find(abs(P-0) < eps)) = 0;
        d = P*g'; % Projection onto plane / border
    end

    %%%% DISPLAY INFORMATION, THIS PART IS NOT NECESSARY, ONLY FOR DEBUGGING
    if Aq*x ~= 0
        disp('ACTIVE SET CONSTRAINTS Aq :')
        Aq
        disp('CURRENT SOLUTION x :')
        x
        disp('MULTIPLICATION OF Aq and x')
        Aq*x
    end
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    % Values smaller than 'eps' are changed into 0
    d(find(abs(d-0) < eps)) = 0;

    if ~isempty(find(d~=0)) && rank(P) < length(x)
        % Line search for optimal lambda
        lopt = ((g*d)/(d'*Qd*d));
        lmax = inf;
        for i = 1:length(x)
            if d(i) < 0 && -x(i) ~= 0 && -x(i)/d(i) <= lmax
                lmax = -x(i)/d(i);
            elseif d(i) > 0 && (C-x(i))/d(i) <= lmax
                lmax = (C-x(i))/d(i);
            end
        end
        lambda = max(0, min([lopt, lmax]));
        if abs(lambda) < eps
            lambda = 0;
        end
        xo = x;
        x = x + lambda*d;
        iters = iters + 1;
    end

    % Check whether search direction is 0-vector or 'e'-criterion met.
    if isempty(find(d~=0)) || abs(L(x)-L(xo)) < e
        optfound = 1;
    end
end

%%% STEP 3: GET THE WEIGHTS
alphas = x;
w = zeros(1, length(data(1,:)));
for i = 1:size(data,1)
    w = w + labels(i)*alphas(i)*data(i,:);
end
svinds = find(alphas>0);
svind = svinds(1);
b = 1/labels(svind) - w*data(svind, :)';

%%% STEP 4: OPTIMALITY CHECK, KKT conditions. See KKT-conditions for reference.
weights = [b; w'];
datadim = length(data(1,:));
Q = [zeros(1,datadim+1); zeros(datadim, 1), eye(datadim)];
A = [ones(size(data,1), 1), data];
for i = 1:length(labels)
    A(i,:) = A(i,:)*labels(i);
end
LagDuG = Q*weights - A'*alphas;
Ac = A*weights - ones(length(labels),1);
alpA = alphas.*Ac;
LagDuG(any(abs(LagDuG-0) < 10^-14)) = 0;
if ~any(alphas < 0) && all(LagDuG == zeros(datadim+1,1)) && all(abs(Ac) >= 0) && all(abs(alpA) < 10^-6)
    disp('Optimal found, Karush-Kuhn-Tucker conditions satisfied.')
else
    disp('Optimal not found, Karush-Kuhn-Tucker conditions not satisfied.')
end

% VISUALIZATION FOR 2D-CASE
if size(data, 2) == 2
    pinds = find(labels > 0);
    ninds = find(labels < 0);
    plot(data(pinds, 1), data(pinds, 2), 'o', 'MarkerFaceColor', 'red', 'MarkerEdgeColor', 'black')
    hold on
    plot(data(ninds, 1), data(ninds, 2), 'o', 'MarkerFaceColor', 'blue', 'MarkerEdgeColor', 'black')
    Xb = min(data(:,1))-1;
    Xe = max(data(:,1))+1;
    Yb = -(b+w(1)*Xb)/w(2);
    Ye = -(b+w(1)*Xe)/w(2);
    lineh = plot([Xb Xe], [Yb Ye], 'LineWidth', 2);
    supvh = plot(data(find(alphas~=0), 1), data(find(alphas~=0), 2), 'g.');
    legend([lineh, supvh], 'Decision boundary', 'Support vectors');
    hold off
end

Note:

If you run example 1, you should get output starting with the following:

[image]

As you can see, the multiplication of Aq and x does not produce the value 0, even though it should. In this particular example that is not a bad thing, but if I have more data points with lots of decimals in them, this inaccuracy grows, because the computations are not exact. That is bad, for example, when I am searching for a new direction vector as I move toward the optimal solution in the gradient projection method. The search direction is then not the exactly correct direction, only close to it. This is why I want exactly correct values... is that possible?

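I wonder whether the decimals in the data points have something to do with the accuracy of my results. See the picture below:

[image]

So the question is: is this caused by the data, or is something going wrong in the optimization process...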

Answer 1

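Are you using the format function in your script? It looks like you used format rat somewhere.

You can always use MATLAB's eps function, which returns the floating-point relative accuracy (the distance from 1.0 to the next larger double). On my MATLAB R2014b, the absolute value of -1/18014398509481984 is smaller than this: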

format long
a = abs(-1/18014398509481984)
b = eps
a < b

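This basically means the result is zero to within double precision: a residue this far below eps is rounding error, not a meaningful value.

Otherwise, you can use format long in your script before the computation, so that results are displayed as decimals rather than as rational approximations.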
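Building on the eps comparison, here is a minimal sketch of a tolerance-based zero test that could stand in for an exact check such as Aq*x ~= 0. The name tol and the way it is scaled are assumptions made for illustration; they are not taken from the original script, and a real solver may need a different threshold.

```matlab
% Sketch: test "is Aq*x zero?" with a tolerance instead of exact equality.
% tol is a hypothetical name; the scaling below is one common choice,
% not something taken from the original script.
Aq = [-1 -1 1 1];
x  = [12/65; 28/65; 32/65; 8/65];

r = Aq*x;   % either 0 or a tiny rounding residue

% Scale eps by the number of terms and the size of the operands,
% so the threshold tracks the magnitude of the data.
tol = numel(x) * eps(max(abs(Aq)) * max(abs(x)));

if abs(r) <= tol
    disp('Aq*x is zero to within rounding error')
else
    disp('Aq*x is genuinely nonzero')
end
```

For these values tol works out to eps, so a residue of size 5.55e-17 is treated as zero, while a real constraint violation would not be.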

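Edit

I see the inv function in your code; try replacing it with the \ operator (mldivide). Its results will be more accurate because it uses Gaussian elimination without forming the inverse.

The inv documentation states:

In practice, it is seldom necessary to form the explicit inverse of a matrix. A frequent misuse of inv arises when solving the system of linear equations Ax = b. One way to solve this is with x = inv(A)*b. A better way, from both an execution time and numerical accuracy standpoint, is to use the matrix division operator x = A\b. This produces the solution using Gaussian elimination, without forming the inverse.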
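As a sketch of what that replacement could look like for the projection matrix in the question, the snippet below builds P both ways. The variable names P_inv and P_div are made up for the comparison; the formula itself is the one from the question's code.

```matlab
% Sketch: the projection matrix from the question, built with mldivide
% instead of inv. Same formula: P = I - Aq'*(Aq*Aq')^(-1)*Aq.
Aq = [-1 -1 1 1];   % active-set constraint row(s), as in the question
n  = size(Aq, 2);

P_inv = eye(n) - Aq'*inv(Aq*Aq')*Aq;   % original form from the question
P_div = eye(n) - Aq'*((Aq*Aq')\Aq);    % solves (Aq*Aq')*Z = Aq, no explicit inverse

% Both should agree up to rounding. P projects onto the null space of Aq,
% so Aq*P should be numerically zero.
disp(max(abs(P_inv(:) - P_div(:))))
disp(max(abs(Aq*P_div)))
```

With a single constraint row, Aq*Aq' is a scalar and the two forms agree; any benefit of mldivide would show up only once several active constraints make Aq*Aq' a larger, possibly ill-conditioned matrix.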

Answer 2

With the provided code, this is how I tested it:

I added a breakpoint at the following code:

if Aq*x ~= 0
    disp('ACTIVE SET CONSTRAINTS Aq :')
    Aq
    disp('CURRENT SOLUTION x :')
    x
    disp('MULTIPLICATION OF Aq and x')
    Aq*x
end

When the if branch was taken, I typed the following in the console:

K>> format rat; disp(x);
          12/65
          28/65
          32/65
           8/65

K>> disp(x == [12/65; 28/65; 32/65; 8/65]);
          0
          1
          0
          0

K>>  format('long'); disp(max(abs(x - [12/65; 28/65; 32/65; 8/65])));
          1.387778780781446e-17

K>>  disp(eps(8/65));
          1.387778780781446e-17

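This suggests that this is a display issue: format rat deliberately uses small integers to express the value, at the expense of precision. Evidently, the true value of x(4) is not the double that 8/65 evaluates to but the double-precision number adjacent to it: the two differ by exactly eps(8/65), one unit in the last place.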
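To see what a double actually stores, rather than what format rat chooses to display, something along these lines may help. Here vn is a hypothetical stand-in for a value like x(4): it is simply the next double above 8/65, not the value from the actual run.

```matlab
% Sketch: look at what a double actually stores, beyond what format rat shows.
v  = 8/65;          % the double nearest to 8/65
vn = v + eps(v);    % a neighbouring double (hypothetical stand-in for x(4))

fprintf('%.20g\n', v);    % decimal digits of the stored value
fprintf('%.20g\n', vn);
disp(num2hex(v));         % raw IEEE 754 bit pattern as hex
disp(num2hex(vn));        % a slightly different bit pattern

% rats(), like format rat, returns a short rational approximation,
% so both values are shown as the same fraction.
disp(strtrim(rats(v)));
disp(strtrim(rats(vn)));
disp(v == vn);            % yet the two doubles are not equal
```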

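So, this raises the question: are you sure that numerical convergence depends on flipping the least significant bit of a double-precision value?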
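For a sense of scale, the short sketch below checks how large the reported residue is relative to eps, and shows with unrelated textbook values (0.1, 0.2, 0.3, not data from the question) that merely changing the order of additions can move a result by one unit in the last place.

```matlab
% Sketch: the reported residue is exactly -2^-54, i.e. a quarter of eps.
r = -1/18014398509481984;
disp(r == -2^-54);    % 1 (true)
disp(abs(r) / eps);   % 0.25

% Floating-point addition is not associative, so the same terms
% summed in a different order can differ in the last place.
a = (0.1 + 0.2) + 0.3;
b = 0.1 + (0.2 + 0.3);
disp(a == b);                 % 0 (false)
disp(abs(a - b) <= eps(a));   % 1 (true): they differ by one unit in the last place
```

This is one plausible reason the same product could give 0 in the command window and a tiny nonzero value inside the script: the inputs or the order of operations need only differ at the last bit.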

Precision issue in matrix multiplication in Matlab R2012b
