I am trying to implement the logistic regression algorithm without calling any of MATLAB's built-in functions, and afterwards I call the MATLAB logistic regression function mnrfit so that I can cross-check that my algorithm works correctly.
The procedure I am implementing is as follows. First I build a vector x with the input data and a vector y of labels in [0, 1] holding the class of each data point in x. I fit a linear regression to these data using gradient descent, and once I have extracted the coefficients I pass the fitted line through a sigmoid function. Later I make a prediction at x = 10 to find the probability of class 1 for this input. Simple enough...
After that I call the MATLAB function mnrfit and extract the logistic regression coefficients. To make the same prediction I evaluate mnrval at 10, since I want to predict for the input x = 10 as before. My results are different and I don't know why...
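(For reference: mnrval(B, X) returns an n-by-k matrix with one column of probabilities per category. Because the classes are passed to mnrfit as y+1, class 0 becomes category 1 and class 1 becomes category 2, so the second column holds the probability of the original class 1. A minimal sketch of the calls, using placeholder variable names rather than anything from the attached code, and assuming y is the 0/1 label vector defined below:)
xcol = (1:100)';           % predictor column (no intercept column; mnrfit adds its own)
B    = mnrfit(xcol, y+1);  % y+1 maps class 0 -> category 1, class 1 -> category 2
phat = mnrval(B, xcol);    % n-by-2 matrix: [P(category 1), P(category 2)]
p1   = phat(10, 2);        % probability of the original class 1 at the 10th row (x = 10)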
The two resulting plots, showing the probability density function for each case, are displayed at the end. I also attach the code of my implementation.
% x is the continuous input and y is the category of every output [1 or 0]
x = (1:100)'; % independent variables x(s)
y(1:10) = 0; % Dependent variables y(s) -- class 0
y(11:100) = 1; % Dependent variables y(s) -- class 1
y=y';
y = y(randperm(length(y))); % Random order of y array
x=[ones(length(x),1) x]; % This is done for vectorized code
%% Initialize Linear regression parameters
m = length(y); % number of training examples
% initialize fitting parameters - all zeros
Alpha = 0; % gradient
Beta = 0; % offset
% Some gradient descent settings
% iterations must be a big number because we are taking very small steps.
iterations = 100000;
% Learning step must be small because the line must fit the data between
% [0 and 1]
Learning_step_a = 0.0005; % step parameter
%% Run Gradient descent
fprintf('Running Gradient Descent ...\n')
for iter = 1:iterations
% In every iteration calculate objective function
h= Alpha.*x(:,2)+ Beta.*x(:,1);
% Update line variables
Alpha=Alpha - Learning_step_a * (1/m)* sum((h-y).* x(:,2));
Beta=Beta - Learning_step_a * (1/m) * sum((h-y).*x(:,1));
end
% This is my linear Model
LinearModel=Alpha.*x(:,2)+ Beta.*x(:,1);
% I pass it through a sigmoid !
LogisticRegressionPDF = 1 ./ (1 + exp(-LinearModel));
% Make a prediction for p(y==1|x==10)
Prediction1=LogisticRegressionPDF(10);
%% Confirmation with matlab function mnrfit
B=mnrfit(x(:,2),y+1); % Find Logistic Regression Coefficients
mnrvalPDF = mnrval(B,x(:,2));
% Make a prediction .. p(y==1|x==10)
Prediction2=mnrvalPDF(10,2);
%% Plotting Results
% Plot Logistic Regression Results ...
figure;
plot(x(:,2),y,'g*');
hold on
plot(x(:,2),LogisticRegressionPDF,'k--');
hold off
title('My Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function');
% Plot Logistic Regression Results (mnrfit) ...
figure,plot(x(:,2),y,'g*');
hold on
plot(x(:,2),mnrvalPDF(:,2),'--k')
hold off
title('mnrval Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function')
Why are my plots (as well as my predictions) not the same in the two cases?
Answer 0 (score: 1)
I developed my own logistic regression algorithm using gradient descent. For "good" training data, my algorithm has no choice but to converge to the same solution as mnrfit. For not-so-good training data, my algorithm does not come as close to mnrfit: the coefficients and the resulting model still predict the outcome well, just not as well as mnrfit. Plotting the residuals shows that mnrfit's residuals are essentially zero (on the order of 9x10^-200), while mine are merely close to zero (around 0.00001). I tried changing alpha, the number of steps, and the initial theta guess, but doing so only produced different theta values. When I tuned these parameters on a good data set, my thetas started to converge much better to mnrfit's.
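For illustration, a residual comparison of that kind might look like the sketch below (assuming Alpha, Beta, x, y and B already exist as in the question's code; p_mine, p_mnr and the other names are made up for this example):
% Probabilities from the hand-rolled model (Alpha = slope, Beta = intercept)
p_mine = 1 ./ (1 + exp(-(Beta.*x(:,1) + Alpha.*x(:,2))));
% Probabilities from the mnrfit model; column 2 corresponds to the original class 1
p_mnr = mnrval(B, x(:,2));
p_mnr = p_mnr(:,2);
% Residuals against the observed 0/1 labels
res_mine = y - p_mine;
res_mnr  = y - p_mnr;
fprintf('mean absolute residual -- mine: %g, mnrfit: %g\n', mean(abs(res_mine)), mean(abs(res_mnr)));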
Answer 1 (score: 1)
Many thanks to user3779062 for the information; the PDF file contained exactly what I needed. I had already implemented gradient descent, so the only differences needed to implement logistic regression are to pass the hypothesis function through the sigmoid inside the for loop and to change the order of the terms (and hence the sign) in the theta-update rule. The results are identical to mnrval. I ran the code on many examples and the results are the same most of the time (especially if the data set is good and carries plenty of information about both classes). The final code and one randomly chosen run from the result set are attached below.
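Before the listing, the update rule in question, written out for reference: the loop now performs gradient ascent on the logistic log-likelihood rather than gradient descent on the squared error, which is where the sign change comes from. With the hypothesis h_theta(x) = 1/(1 + exp(-theta'*x)), the standard result is
\ell(\theta) = \sum_{i=1}^{m}\big[\, y_i \log h_\theta(x_i) + (1-y_i)\log(1-h_\theta(x_i)) \big],
\qquad
\frac{\partial \ell}{\partial \theta_j} = \sum_{i=1}^{m}\big(y_i - h_\theta(x_i)\big)\, x_{ij}
so each step moves Alpha and Beta in the direction of (1/m)*sum((y - hx).*x), exactly as in the loop below.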
% Machine Learning : Logistic Regression
% Logistic regression works like linear regression, but its output
% specifies the probability of belonging to one category or the other.
% At the beginning we create a well-defined data set that can easily
% be fitted by a sigmoid function.
clear all; close all; clc;
% This example runs many times to compare a lot of results
for examples=1:10:100
clearvars -except examples
%% Create Training Data
% x is the continuous input and y is the category of every output [1 or 0]
x = (1:100)'; % independent variables x(s)
y(1:examples) = 0; % Dependent variables y(s) -- class 0
y(examples+1:100) = 1; % Dependent variables y(s) -- class 1
y=y';
y = y(randperm(length(y))); % Random order of y array
x=[ones(length(x),1) x]; % This is done for vectorized code
%% Initialize Linear regression parameters
m = length(y); % number of training examples
% initialize fitting parameters - all zeros
Alpha = 0; % gradient
Beta = 0; % offset
% Some gradient descent settings
% iterations must be a big number because we are taking very small steps.
iterations = 100000;
% Learning step must be small because the line must fit the data between
% [0 and 1]
Learning_step_a = 0.0005; % step parameter
%% Run Gradient descent
fprintf('Running Gradient Descent ...\n')
for iter = 1:iterations
% Linear hypothesis function
h= Alpha.*x(:,2)+ Beta.*x(:,1);
% Non-linear hypothesis function (sigmoid)
hx = 1 ./ (1 + exp(-h));
% Update coefficients
Alpha=Alpha + Learning_step_a * (1/m)* sum((y-hx).* x(:,2));
Beta=Beta + Learning_step_a * (1/m) * sum((y-hx).*x(:,1));
end
% Make a prediction for p(y==1|x==10)
Prediction1=hx(10)
%% Confirmation with matlab function mnrfit
B=mnrfit(x(:,2),y+1); % Find Logistic Regression Coefficients
mnrvalPDF = mnrval(B,x(:,2));
% Make a prediction .. p(y==1|x==10)
Prediction2=mnrvalPDF(10,2)
%% Plotting Results
% Plot Logistic Regression Results ...
figure;
subplot(1,2,1),plot(x(:,2),y,'g*');
hold on
subplot(1,2,1),plot(x(:,2),hx,'k--');
hold off
title('My Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function');
% Plot Logistic Regression Results (mnrfit) ...
subplot(1,2,2),plot(x(:,2),y,'g*');
hold on
subplot(1,2,2),plot(x(:,2),mnrvalPDF(:,2),'--k')
hold off
title('mnrval Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function')
end
Results...
Thank you very much!!