Precision problem with matrix multiplication in Matlab R2012b

Date: 2015-05-20 11:22:58

Tags: matlab optimization matrix-multiplication floating-accuracy numerical

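I have implemented a script that performs constrained optimization to solve for the optimal parameters of a Support Vector Machine model. I have noticed that, for some reason, my script gives inaccurate results (although very close to the actual values). A typical case is a calculation that should result in exactly 0 but instead gives something like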

-1/18014398509481984 = -5.551115123125783e-17

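This happens when I multiply a matrix by a vector. What is also strange is that if I do the multiplication by hand in Matlab's command window, I get 0 as the result.

Let me give an example: if I take the vectors Aq = [-1 -1 1 1] and x = [12/65 28/65 32/65 8/65]' and multiply them in the command window, I get 0 as the result, as the screenshot below shows:

[screenshot]

If, on the other hand, I do this in my function script, I do not get 0 as the result but the value -1/18014398509481984.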

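Here is the part of my script that is responsible for this multiplication (I have added Aq and x to the script to show their contents):

    disp('DOT PRODUCT OF ACTIVE SET AND NEW POINT: ')
    Aq
    x
    Aq*x

Here is the result of the code above when it runs:

[screenshot]

As you can see, the result is not 0 even though it really should be. Note that this problem does not occur for all possible values of Aq and x. If Aq = [-1 -1 1 1] and x = [4/13 4/13 4/13 4/13], the result is exactly 0, as shown here:

[screenshot]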

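What causes this inaccuracy? How can I fix it?

P.S. I did not include my whole code because it is not well documented and is a few hundred lines long, but I will on request.

Thank you!

UPDATE: New test using Ander Biguri's suggestion:

[screenshot]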

UPDATE 2: CODE

    function [weights, alphas, iters] = solveSVM(data, labels, C, e)
    % FUNCTION [weights, alphas, iters] = solveSVM(data, labels, C, e)
    %
    % AUTHOR: jjepsuomi
    %
    % VERSION: 1.0
    %
    % DESCRIPTION:
    %   - This function will attempt to solve the optimal weights for a Support
    %   Vector Machines (SVM) model using active set method with gradient
    %   projection.
    %
    % INPUTS:
    %   "data" a n-by-m data matrix. The number of rows 'n' corresponds to the
    %   number of data points and the number of columns 'm' corresponds to the
    %   number of variables.
    %   "labels" a 1-by-n row vector of data labels from the set {-1,1}.
    %   "C" Box constraint upper limit. This will constrain the values of 'alphas'
    %   to the range 0 <= alphas <= C. If hard-margin SVM model is required set
    %   C=Inf.
    %   "e" a real value corresponding to the convergence criterion, that is if
    %   solution Xi and Xi-1 are within distance 'e' from each other stop the
    %   learning process, i.e. IF |F(Xi)-F(Xi-1)| < e ==> stop learning process.
    %
    % OUTPUTS:
    %   "weights" a vector corresponding to the optimal decision line parameters.
    %   "alphas" a vector of alpha-values corresponding to the optimal solution
    %   of the dual optimization problem of SVM.
    %   "iters" number of iterations until learning stopped.
    %
    % EXAMPLE USAGE 1:
    %
    %   'Hard-margin SVM':
    %
    %   data = [0 0;2 2;2 0;3 0];
    %   labels = [-1 -1 1 1];
    %   [weights, alphas, iters] = solveSVM(data, labels, Inf, 10^-100)
    %
    % EXAMPLE USAGE 2:
    %
    %   'Soft-margin SVM':
    %
    %   data = [0 0;2 2;2 0;3 0];
    %   labels = [-1 -1 1 1];
    %   [weights, alphas, iters] = solveSVM(data, labels, 0.8, 10^-100)

    % STEP 1: INITIALIZATION OF THE PROBLEM

    format long

    % Calculate linear kernel matrix
    L = kron(labels', labels);
    K = data*data';
    % Hessian matrix
    Qd = L.*K;
    % The minimization function
    L = @(a) (1/2)*a'*Qd*a - ones(1, length(a))*a;
    % Gradient of the minimizable function
    gL = @(a) a'*Qd - ones(1, length(a));

    % STEP 2: THE LEARNING PROCESS, ACTIVE SET WITH GRADIENT PROJECTION

    % Initial feasible solution (required by gradient projection)
    x = zeros(length(labels), 1);
    iters = 1;
    optfound = 0;

    while optfound == 0 % criterion met

        % Negative of the gradient at initial solution
        g = -gL(x);
        % Set the active set and projection matrix
        Aq = labels; % In plane y^Tx = 0
        P = eye(length(x))-Aq'*inv(Aq*Aq')*Aq; % In plane projection
        % Values smaller than 'eps' are changed into 0
        P(find(abs(P-0) < eps)) = 0;
        d = P*g'; % Projection onto plane
        if ~isempty(find(x==0 | x==C)) % Constraints active?
            acinds = find(x==0 | x==C);
            for i = 1:length(acinds)
                if (x(acinds(i)) == 0 && d(acinds(i)) < 0) || x(acinds(i)) == C && d(acinds(i)) > 0
                    % Make the constraint vector
                    constr = zeros(1,length(x));
                    constr(acinds(i)) = 1;
                    Aq = [Aq; constr];
                end
            end
            % Update the projection matrix
            P = eye(length(x))-Aq'*inv(Aq*Aq')*Aq; % In plane / box projection
            % Values smaller than 'eps' are changed into 0
            P(find(abs(P-0) < eps)) = 0;
            d = P*g'; % Projection onto plane / border
        end

        %%%% DISPLAY INFORMATION, THIS PART IS NOT NECESSARY, ONLY FOR DEBUGGING
        if Aq*x ~= 0
            disp('ACTIVE SET CONSTRAINTS Aq :')
            Aq
            disp('CURRENT SOLUTION x :')
            x
            disp('MULTIPLICATION OF Aq and x')
            Aq*x
        end
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

        % Values smaller than 'eps' are changed into 0
        d(find(abs(d-0) < eps)) = 0;
        if ~isempty(find(d~=0)) && rank(P) < length(x)
            % Line search for optimal lambda
            lopt = ((g*d)/(d'*Qd*d));
            lmax = inf;
            for i = 1:length(x)
                if d(i) < 0 && -x(i) ~= 0 && -x(i)/d(i) <= lmax
                    lmax = -x(i)/d(i);
                elseif d(i) > 0 && (C-x(i))/d(i) <= lmax
                    lmax = (C-x(i))/d(i);
                end
            end
            lambda = max(0, min([lopt, lmax]));
            if abs(lambda) < eps
                lambda = 0;
            end
            xo = x;
            x = x + lambda*d;
            iters = iters + 1;
        end

        % Check whether search direction is 0-vector or 'e'-criterion met.
        if isempty(find(d~=0)) || abs(L(x)-L(xo)) < e
            optfound = 1;
        end
    end

    %%% STEP 3: GET THE WEIGHTS
    alphas = x;
    w = zeros(1, length(data(1,:)));
    for i = 1:size(data,1)
        w = w + labels(i)*alphas(i)*data(i,:);
    end
    svinds = find(alphas>0);
    svind = svinds(1);
    b = 1/labels(svind) - w*data(svind, :)';

    %%% STEP 4: OPTIMALITY CHECK, KKT conditions. See KKT-conditions for reference.
    weights = [b; w'];
    datadim = length(data(1,:));
    Q = [zeros(1,datadim+1); zeros(datadim, 1), eye(datadim)];
    A = [ones(size(data,1), 1), data];
    for i = 1:length(labels)
        A(i,:) = A(i,:)*labels(i);
    end
    LagDuG = Q*weights - A'*alphas;
    Ac = A*weights - ones(length(labels),1);
    alpA = alphas.*Ac;
    LagDuG(any(abs(LagDuG-0) < 10^-14)) = 0;
    if ~any(alphas < 0) && all(LagDuG == zeros(datadim+1,1)) && all(abs(Ac) >= 0) && all(abs(alpA) < 10^-6)
        disp('Optimal found, Karush-Kuhn-Tucker conditions satisfied.')
    else
        disp('Optimal not found, Karush-Kuhn-Tucker conditions not satisfied.')
    end

    % VISUALIZATION FOR 2D-CASE
    if size(data, 2) == 2
        pinds = find(labels > 0);
        ninds = find(labels < 0);
        plot(data(pinds, 1), data(pinds, 2), 'o', 'MarkerFaceColor', 'red', 'MarkerEdgeColor', 'black')
        hold on
        plot(data(ninds, 1), data(ninds, 2), 'o', 'MarkerFaceColor', 'blue', 'MarkerEdgeColor', 'black')
        Xb = min(data(:,1))-1;
        Xe = max(data(:,1))+1;
        Yb = -(b+w(1)*Xb)/w(2);
        Ye = -(b+w(1)*Xe)/w(2);
        lineh = plot([Xb Xe], [Yb Ye], 'LineWidth', 2);
        supvh = plot(data(find(alphas~=0), 1), data(find(alphas~=0), 2), 'g.');
        legend([lineh, supvh], 'Decision boundary', 'Support vectors');
        hold off
    end

NOTE:

If you run example 1, you should get output beginning with the following:

[screenshot]

As you can see, the multiplication between Aq and x does not produce the value 0, even though it should. In this particular example that is not a bad thing, but if I had more data points with many decimals in them, this inaccuracy would grow larger and larger because the calculations are not exact. This is bad, for example, when I am searching for a new direction vector while moving towards the optimal solution in the gradient projection method: the search direction is not the correct direction, only close to it. This is why I want exactly correct values... Is this possible?

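I wonder whether the decimals in the data points have something to do with the accuracy of my results. See the picture below:

[screenshot]

So the question is: is this caused by the data, or is something wrong in the optimization procedure...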

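2 Answers:

Answer 0 (score: 3)

Are you using the format function in your script? It looks like you have used format rat somewhere.

You can always use MATLAB's eps function, which returns the floating-point relative accuracy MATLAB works with (the distance from 1.0 to the next larger double). In my Matlab R2014B, the absolute value of -1/18014398509481984 is smaller than this value: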

format long
a = abs(-1/18014398509481984)
b = eps
a < b

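This basically means that the result is zero to within machine precision: the leftover value is round-off error smaller than eps, not a meaningful nonzero quantity.

Otherwise, you can use format long in your script before the calculation.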
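The eps comparison above can be turned into a tolerance test in place of an exact `~= 0` check. The following is only a minimal sketch: the residual and the vector x are the values quoted in the question, while the tolerance rule (number of summed terms times eps at the largest magnitude) and the variable names are assumptions chosen for illustration, not a rule taken from the question or from MATLAB.

```matlab
% Residual reported in the question; it equals -2^-54 exactly.
r = -1/18014398509481984;

% Solution vector quoted in the question.
x = [12/65 28/65 32/65 8/65]';

% Hypothetical tolerance: number of summed terms times the spacing of
% doubles at the largest entry of x (an assumption for illustration).
tol = numel(x) * eps(max(abs(x)));

isExactlyZero     = (r == 0);        % exact comparison
isNumericallyZero = (abs(r) <= tol); % comparison up to round-off
```

With such a test, the debugging branch `if Aq*x ~= 0` in the question would fire only when the residual exceeds the expected round-off level. How tight the tolerance should be depends on the data and is a design choice.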

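Edit

I see the inv function in your code. Try replacing it with the \ operator (mldivide). Its results will be more accurate because it uses Gaussian elimination without forming the inverse.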

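The inv documentation states:

"In practice, it is seldom necessary to form the explicit inverse of a matrix. A frequent misuse of inv arises when solving the system of linear equations Ax = b. One way to solve this is with x = inv(A)*b. A better way, from both an execution time and numerical accuracy standpoint, is to use the matrix division operator x = A\b. This produces the solution using Gaussian elimination, without forming the inverse."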
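As a sketch of what that replacement could look like for the projection matrix in the question, the snippet below builds P both ways and compares them. The second row of Aq is an illustrative bound constraint of the kind the question's loop appends, and the names P_inv, P_div, maxDiff and residual are hypothetical. No accuracy gain is claimed for this tiny, well-conditioned example; it only shows that the two expressions describe the same projector.

```matlab
% Example active set: the label row from the question plus one
% illustrative bound constraint (a unit row vector).
Aq = [-1 -1 1 1; 1 0 0 0];
n  = size(Aq, 2);

% Projection matrix as written in the question, using an explicit inverse.
P_inv = eye(n) - Aq'*inv(Aq*Aq')*Aq;

% Same projector with mldivide: solve (Aq*Aq')*Y = Aq instead of inverting.
P_div = eye(n) - Aq'*((Aq*Aq')\Aq);

% Largest entrywise difference between the two results.
maxDiff = max(abs(P_inv(:) - P_div(:)));

% A projector onto the null space of Aq should satisfy Aq*P = 0 up to round-off.
residual = max(max(abs(Aq*P_div)));
```

Both quantities are expected to be at round-off level, which again suggests judging them against a tolerance instead of comparing with exact zero.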

Answer 1 (score: 2)

With the code provided, this is how I tested:

  1. I added a breakpoint in the following code:

    if Aq*x ~= 0
        disp('ACTIVE SET CONSTRAINTS Aq :')
        Aq
        disp('CURRENT SOLUTION x :')
        x
        disp('MULTIPLICATION OF Aq and x')
        Aq*x
    end
  2. When the if branch was taken, I typed the following into the console:

    K>> format rat; disp(x);
              12/65
              28/65
              32/65
               8/65

    K>> disp(x == [12/65; 28/65; 32/65; 8/65]);
         0
         1
         0
         0

    K>> format('long'); disp(max(abs(x - [12/65; 28/65; 32/65; 8/65])));
         1.387778780781446e-17

    K>> disp(eps(8/65));
         1.387778780781446e-17

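  3. This suggests that it is a display problem: format rat deliberately expresses the value with small integers, at the expense of precision. Evidently, the true value of x(4) is the double next to 8/65, not 8/65 itself as the rat display shows.

    So, this raises the question: are you sure that numerical convergence depends on flipping the least significant bit of a double-precision value?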
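The display effect described in this answer can be reproduced in isolation. This is a minimal sketch with hypothetical variable names: it takes 8/65 and its neighbouring double and compares how `rats` (the string form of the rational approximation that format rat displays) renders them.

```matlab
a = 8/65;
b = a + eps(a);   % the next representable double above 8/65

% rats returns the rational approximation as a character array.
sameDisplay = strcmp(rats(a), rats(b));

% Exact comparison of the underlying doubles.
sameValue = (a == b);

% Distance between the two values: one unit in the last place.
gap = b - a;
```

The two numbers are rendered as the same fraction even though they are different doubles, which is consistent with x looking like 8/65 in the console while failing the `==` test.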