Regression lines for clusters of points in Matlab

Date: 2014-02-10 01:15:57

Tags: matlab 2d regression points

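I can plot a regression line in Matlab for a single set of x,y points. But suppose I have four groups of points (as in the figure below) and want to draw four regression lines, one per group. How can I do that? All the points are stored together in x,y. There is no way for me to separate them into four different sets of variables.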

See the figure below; ignore the legend and labels. Any idea how I can do this in Matlab? I can do it when there is only one cluster, but I want to do it for all four clusters at once.

The code I am using for a single cluster:

 %----------- Linear regression -----------------
 p = polyfit(x,y,1);    % p(1) = slope, p(2) = intercept
 f = polyval(p,x);      % fitted values at the data points
 %----------- Call R-square function ------------
 r2 = Rsquare(x,y,p);   % user-defined function (not shown)

 %------------- Plot data -----------------------
 figure()
 plot(x,y,'*k'); hold on
 plot(x,f,'-r'); % show linear fit
 xlabel('index');
 ylabel('Intensity a.u.');
 title('Test: Linear regression & R-square');
 %------- Show y-data on current figure ---------
 for i = 1:numel(y)     % works for both row and column vectors
     str = num2str(y(i));
     text(x(i),y(i),str,'Color',[0 0 1]);
 end
 %--Show linear equation on current figure -------
 m1 = num2str(p(1)); c1 = num2str(p(2)); Rsquare1 = num2str(r2(1));
 text(1.05,80,['y= ',m1,'x+',c1,' , R^2= ',Rsquare1,'.'],'FontSize',10,'FontName','Times New Roman');
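
The `Rsquare` function called above is not shown in the question. As a rough sketch of what it might compute, assuming it returns the ordinary coefficient of determination of the linear fit (the data and the names `yfit`, `SSres`, `SStot` are illustrative, not from the original code):

```matlab
% Hypothetical stand-in for the unshown Rsquare(x,y,p).
% Example data: a line y = 2x + 1 with small fixed perturbations.
x = (1:10)';
y = 2*x + 1 + [0.3 -0.2 0.1 -0.4 0.2 0.0 -0.1 0.3 -0.3 0.1]';

p    = polyfit(x, y, 1);          % p(1) = slope, p(2) = intercept
yfit = polyval(p, x);             % fitted values

SSres = sum((y - yfit).^2);       % residual sum of squares
SStot = sum((y - mean(y)).^2);    % total sum of squares
r2    = 1 - SSres/SStot;          % coefficient of determination
```

A value of `r2` close to 1 means the line explains most of the variance in `y`.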

1 answer:

Answer 0 (score: 3)

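You will have to split your values into clusters. This is a highly non-trivial operation. It can be done with kmeans from the Statistics Toolbox, for example: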

%// First, I generate some example data in 4 clusters. 

%// intercepts
a = [4 7  0 -5];

%// slopes
b = [0.7 1.0 1.0 0.8];

%// ranges
xmin = [+1  -6  -6  +1];
xmax = [+6  -1  -1  +6];

%// generate clusters 
N = [30 40 25 33];
X = arrayfun(@(ii) (xmax(ii)-xmin(ii))*rand(N(ii),1) + xmin(ii), 1:4, 'UniformOutput', false);
Y = arrayfun(@(ii) a(ii) + b(ii)*X{ii} + randn(size(X{ii})), 1:4, 'UniformOutput', false);


%// Unfortunately, your points are not given in 4 separate clusters, but 
%// in a single array:
X = cat(1,X{:});
Y = cat(1,Y{:});

%// Therefore, you'll have to separate the data again into clusters: 
idx = kmeans([X,Y], 4, 'Replicates', 2);

X = {
    X(idx==1)
    X(idx==2)
    X(idx==3)
    X(idx==4)
};

Y = {
    Y(idx==1)
    Y(idx==2)
    Y(idx==3)
    Y(idx==4)
};


%// Now perform regression on each cluster
ab = arrayfun(@(ii) [ones(size(X{ii})) X{ii}]\Y{ii}, 1:4, 'UniformOutput', false);

%// the original values, and the computed ones
%// note that the order is not the same!
[a; b]
[ab{:}]

%// Plot everything for good measure
figure(1), clf, hold on

plot(...
    X{1}, Y{1}, 'g.',...
    X{2}, Y{2}, 'b.',...
    X{3}, Y{3}, 'r.',...
    X{4}, Y{4}, 'c.')

line([min(X{1}); max(X{1})], ab{1}(1) + ab{1}(2)*[min(X{1}); max(X{1})], 'color', 'k')
line([min(X{2}); max(X{2})], ab{2}(1) + ab{2}(2)*[min(X{2}); max(X{2})], 'color', 'k')
line([min(X{3}); max(X{3})], ab{3}(1) + ab{3}(2)*[min(X{3}); max(X{3})], 'color', 'k')
line([min(X{4}); max(X{4})], ab{4}(1) + ab{4}(2)*[min(X{4}); max(X{4})], 'color', 'k')

Result:

ans =
    4.0000    7.0000         0   -5.0000
    0.7000    1.0000    1.0000    0.8000
ans =
   -4.6503    6.4531    4.5433   -0.6326
    0.7561    0.8916    0.5914    0.7712


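Taking the different order into account (look at the colors in the plot), these results are indeed what you would expect, given the large amount of noise I put in :)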
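
The same idea can be applied directly to the asker's single `x`,`y` arrays with a loop and `polyfit`. The following is only a sketch under stated assumptions: the stand-in data, the cluster count `k`, and the names `in`, `xr` and `cols` are illustrative, and `kmeans` requires the Statistics Toolbox. Whether `kmeans` recovers the intended groups depends on how well separated they are.

```matlab
% Stand-in data so the sketch runs on its own (two well-separated groups).
% With real data, keep your own x and y and set k = 4.
x = [rand(30,1)*5 + 1; -rand(30,1)*5 - 1];
y = [4 + 0.7*x(1:30); 7 + 1.0*x(31:60)] + 0.1*randn(60,1);
k = 2;

x = x(:); y = y(:);                        % force column vectors
idx = kmeans([x y], k, 'Replicates', 5);   % cluster label for every point

cols = lines(k);                           % one color per cluster
p = zeros(k, 2);                           % row c holds [slope intercept]
figure, hold on
for c = 1:k
    in = (idx == c);                       % points belonging to cluster c
    p(c,:) = polyfit(x(in), y(in), 1);     % straight-line fit for this cluster
    xr = [min(x(in)); max(x(in))];         % x-range covered by the cluster
    plot(x(in), y(in), '.', 'Color', cols(c,:));
    plot(xr, polyval(p(c,:), xr), '-k');
end
hold off
```

The cluster labels returned by `kmeans` are arbitrary, so the rows of `p` come out in no particular order, just like the columns of `ab` above.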