I am having some trouble implementing a linear SVM (support vector machine) with gradient descent.
The formulas I am using are as follows.
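Since the equation images may not display, here they are transcribed in LaTeX, matching the code below:

J(\theta) = c \left( -y^{T} \log(h) - (1 - y)^{T} \log(1 - h) \right) + \frac{1}{2} \theta^{T} \theta

\theta := \theta - \alpha \, c^{2} \, X^{T} (h - y) - \alpha \, \frac{1}{2} \, \theta^{T} \theta

where h = g(X\theta) and g(z) = 1 / (1 + e^{-z}) is the sigmoid function.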
The first equation is the cost function, and the second is the update rule for each feature's θ value.
c is the fitting parameter (the regularization parameter),
and alpha is the learning rate, which controls how fast the descent converges.
For some reason, when I run these formulas on my dataset, my J(theta) keeps increasing; it never decreases. I have tried every combination of c and alpha values I could think of.
If there is any mistake in the formulas, I would be glad if someone could point it out.
Here is the Octave code I am using:
clear all
clc
x=[3,1;2,2;1,2;1.5,3;4,1;4,2;4,3;4,5];
y=[1;1;1;1;0;0;0;0];
[m,n]=size(x);
x=[ones(m,1),x];
X=x;
hold off
% In this step we plot the given input data set just to see the distribution of the two classes.
pos = find(y == 1); % This takes the positions (indices) in y of all samples whose class value is 1
neg = find(y == 0); % Similarly, this takes the positions in y of all samples whose class value is 0
% Now we plot column x1 vs x2 for y=1 and y=0
hold on
plot(X(pos, 2), X(pos,3), '+');
plot(X(neg, 2), X(neg, 3), 'o');
axis([min(x(:,2))-2,max(x(:,2))+2, min(x(:,3))-2, max(x(:,3))+2])
xlabel('x1 marks in subject 1')
ylabel('x2 marks in subject 2')
legend('Pass', 'Failed')
hold off
% feature scaling
% Now we scale x1 and x2; we need to skip the first column x0 because it should stay as 1.
% If we don't do feature scaling then the decision line comes out flipped.
%mn = mean(x);
%sd = std(x);
%x(:,2) = (x(:,2) - mn(2))./ sd(2);
%x(:,3) = (x(:,3) - mn(3))./ sd(3);
% Algorithm for linear SVM
g = inline('1.0 ./ (1.0 + exp(-z))'); % sigmoid function
theta = zeros(size(x(1,:)))';
max_iter=100;
j_theta=zeros(max_iter,1); % j_theta is a zero vector used to store the cost J(theta) at each iteration
c=0.1;
alpha=0.1;
for num_iter =1:max_iter
z=x*theta;
h=g(z);
h
j_theta(num_iter)=c .* (-y'* log(h) - (1 - y)'*log(1-h)) + ((0.5) * (theta'*theta)); % the second term is regularization
%% the above equation computes the cost function
grad = (c^2) * x' * (h-y); %% computes the gradient
reg_exprson= alpha .* (0.5) * (theta'*theta); %% Computes the regularization term
theta=theta - (alpha.*grad) - reg_exprson ; %% Computes the new theta vector for each feature
theta
end
figure
plot(0:max_iter-1, j_theta, 'b', 'LineWidth', 2)
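In case it helps anyone reproduce the problem, here is a small numerical-gradient check I put together (just a sketch; eps_step and checking at theta = 0 are arbitrary choices of mine), to compare against the analytic gradient used in the loop above:

t0 = zeros(size(theta)); % check at theta = 0 so the cost is well-behaved
eps_step = 1e-4;
num_grad = zeros(size(t0));
for j = 1:length(t0)
    tp = t0; tp(j) = tp(j) + eps_step;
    tm = t0; tm(j) = tm(j) - eps_step;
    hp = g(x * tp);
    hm = g(x * tm);
    Jp = c * (-y' * log(hp) - (1 - y)' * log(1 - hp)) + 0.5 * (tp' * tp);
    Jm = c * (-y' * log(hm) - (1 - y)' * log(1 - hm)) + 0.5 * (tm' * tm);
    num_grad(j) = (Jp - Jm) / (2 * eps_step); % central difference
end
ana_grad = (c^2) * x' * (g(x * t0) - y); % the analytic gradient from the loop
[num_grad, ana_grad] % the two columns should match if the gradient formula is right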
Thanks