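I am trying to write code that regresses nutritional rating on sugar and fiber, and then computes the sum of squared residuals, R2, and the standard error of the estimate s.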
I am new to Matlab, but I have been following the method at https://www.mathworks.com/help/matlab/data_analysis/linear-regression.html

z=csvread('Cereals no alpha.csv');
[rows,cols]=size(z);
disp([rows,cols])
sugar=z(:,7);
fiber=z(:,5);
rating=z(:,13);
%Regression
V=ones(rows,3);
V(:,2)=sugar;
V(:,3)=fiber;
A=V'*V;
b=V'*rating;
w =A\b;
b0=w(1);
bs=w(2);
bf=w(3);
disp([b0,bs,bf])
%Sum squared error performance function
%perf = sse(z,sugar,fiber);
%disp(perf)
%R^2 maybe?
%Use polyfit to compute a linear regression that predicts y from x:
p = polyfit(sugar,fiber,1)
%fit equation
yfit =p(1)*sugar+p(2);
%Compute the residual values as a vector of signed numbers:
yresid = fiber-yfit;
%Square the residuals and total them to obtain the residual sum of squares:
SSresid =sum(yresid.^2);
%Compute the total sum of squares of y by multiplying the variance of y
%by the number of observations minus 1:
SStotal = (length(fiber)-1) * var(fiber);
%Compute R2
rsq = 1 - SSresid/SStotal
This gives: disp([b0,bs,bf]) = 51.7635 -2.2012 2.8661
p = -0.0761 2.6797
rsq = 0.0199
But based on the data, I expected the regression to have an R2 of about 80.8% and s = 6.24.
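
For reference, the polyfit call in the code above fits fiber against sugar, so rsq there measures how well sugar predicts fiber, not how well sugar and fiber together predict rating. That may be why it comes out near 0.02. Below is a minimal sketch of how R2 and s might be computed from the two-predictor fit instead. It assumes s is defined as sqrt(SSresid/(n-p)), with p the number of fitted coefficients. The three data vectors are made-up placeholder values, not the cereal data, so the printed numbers will not match the 80.8% and 6.24 quoted above.

```matlab
% Sketch: R^2 and standard error of the estimate for rating ~ sugar + fiber.
% The three vectors below are made-up placeholder values, NOT the cereal data;
% replace them with the columns read from the CSV file.
sugar  = [6; 8; 5; 0; 8; 10; 14; 8];
fiber  = [10; 2; 9; 14; 1; 1.5; 1; 2];
rating = [68; 34; 59; 94; 34; 30; 33; 37];

n = numel(rating);
V = [ones(n,1), sugar, fiber];    % design matrix: intercept, sugar, fiber
w = V \ rating;                   % least-squares coefficients [b0; bs; bf]

yfit   = V * w;                   % fitted ratings using both predictors
yresid = rating - yfit;           % residuals of rating (not of fiber)

SSresid = sum(yresid.^2);         % residual sum of squares
SStotal = (n - 1) * var(rating);  % total sum of squares of rating
rsq     = 1 - SSresid / SStotal;  % coefficient of determination

p = size(V, 2);                   % number of fitted coefficients (3 here)
s = sqrt(SSresid / (n - p));      % assumed definition of s: sqrt(SSE/(n-p))

fprintf('R^2 = %.4f, s = %.4f\n', rsq, s);
```

With the real data, sugar, fiber, and rating would be taken from z(:,7), z(:,5), and z(:,13) as in the original code, and V \ rating should give the same coefficients as solving A\b with the normal equations.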