I have the following code:
colBIN = {0.050, 0.055, 0.060, 0.065, 0.070, 0.075, 0.080, 0.085, 0.090, 0.095,0.1};
for i = 1 : length(colBIN)-1
colBIN{i,2} = find(cols(:,1) <= cell2mat(colBIN(i+1,1)) & cols(:,1) > cell2mat(colBIN(i,1)));
end
rowBIN = {0.045, 0.046, 0.047, 0.048, 0.049, 0.050, 0.051, 0.052};
for i = 1 : length(rowBIN)-1
rowBIN{i,2} = find(rows(:,1) <= cell2mat(rowBIN(i+1,1)) & rows(:,1) > cell2mat(rowBIN(i,1)));
end
binCombos = cell(length(rowBIN)-1,length(colBIN)-1);
for m = 1 : length(rowBIN)-1
for n = 1 : length(colBIN)-1
binCombos{n,m} = intersect( rowBIN{m,2}(:,1),colBIN{n,2}(:,1));
end
end
binRows = size(binCombos,1);
binCols = size(binCombos,2)-1;
j = j + 1;
for n = 1 : binRows;
for m = 1 : binCols;
thisBin = binCombos{n,m}(:,:);
if isempty(thisBin)==0
%polyfit
quadmod = polyfit(x_vrbl(thisBin), y_vrbl(thisBin), 2);
interval = 0.0:0.001:1;
quadmodcurve = polyval(quadmod,interval);
[r2 rmse] = rsquare(y_vrbl(thisBin), quadmodcurve);
plot(x_vrbl(thisBin), y_vrbl(thisBin), '*', interval, quadmodcurve);
xlabel('x_vrbl');
ylabel('y_vrbl');
axis([0,1,0,1]);
header = ['R^2 =' num2str(r2),'coeffs:',num2str(quadmod)];
title(header);
saveas(gcf, sprintf('plot_%d.pdf', j));
%residuals
res = y_vrbl(thisBin) - quadmodcurve;
plot(x_vrbl(thisBin),res,'+');
header2 = ['residuals'];
title(header2);
saveas(gcf, sprintf('residuals_%d.pdf', j));
end
j = j + 1;
end
end
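Explanation/problem: binCombos is a 2D cell array in which each cell holds a different number of data points. I fit a quadratic curve to the data in each unique cell and am trying (unsuccessfully) to output the R^2 value and to plot the residuals.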
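I think the problem is that the 'interval' required by the polyval function does not match the array size of y_vrbl(thisBin) when I try to compute rsquare, and the same mismatch occurs when computing the residuals. For example, if I set interval = x_vrbl(thisBin), the residuals "work" but the polyfit gets all messed up.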
Answer 0 (score: 0)
My guess is that this should work:
quadmodcurve = polyval(quadmod,x_vrbl(thisBin)); % evaluate at the sample x values
[r2 rmse] = rsquare(y_vrbl(thisBin), quadmodcurve);
interval = 0.0:0.001:1;
quadmodcurve = polyval(quadmod,interval);
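To determine the fit quality, you have to evaluate the polynomial only at the x values of the samples. To plot the full polynomial curve, you need to evaluate it at many more, regularly spaced x values.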
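The two evaluations can be kept apart as in the minimal sketch below. The data are made up purely for illustration (they are not the asker's data), and R^2 is computed by hand as 1 - SSres/SStot rather than with the File Exchange rsquare function, so the numbers it produces may differ from that function's output.

```matlab
% Synthetic sample data (hypothetical, for illustration only)
x = [0.1; 0.25; 0.4; 0.55; 0.7; 0.9];
y = 0.2 + 0.5*x - 0.3*x.^2 + [0.01; -0.02; 0.015; -0.01; 0.02; -0.015];

% Fit a quadratic to the samples
p = polyfit(x, y, 2);

% 1) Fit quality: evaluate at the sample x values, so sizes match y
yFit = polyval(p, x);
res  = y - yFit;                              % one residual per data point
r2   = 1 - sum(res.^2) / sum((y - mean(y)).^2);

% 2) Plotting: evaluate on a dense, evenly spaced grid
xGrid = 0:0.001:1;
yGrid = polyval(p, xGrid);
```

With these variables, plot(x, y, '*', xGrid, yGrid) draws the samples with the smooth curve, and plot(x, res, '+') draws the residuals against the sample x values.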
Answer 1 (score: 0)
I managed to run the code using the data from http://dropproxy.com/f/4B6 and the rsquare function from the File Exchange, after correcting a few errors:
d = importdata('sample_data.xlsx');
y_vrbl = d.data(:, 1);
x_vrbl = d.data(:, 2);
rows = d.data(:, 3);
cols = d.data(:, 4);
cb = {0.050, 0.055, 0.060, 0.065, 0.070, 0.075, 0.080, 0.085, 0.090, 0.095,0.1};
for i = 1 : length(cb)-1
colBIN{i,2} = find(cols(:,1) <= cell2mat(cb(i+1)) & cols(:,1) > cell2mat(cb(i)));
end
rb = {0.045, 0.046, 0.047, 0.048, 0.049, 0.050, 0.051, 0.052};
for i = 1 : length(rb)-1
rowBIN{i,2} = find(rows(:,1) <= cell2mat(rb(i+1)) & rows(:,1) > cell2mat(rb(i)));
end
binCombos = cell(length(rowBIN)-1,length(colBIN)-1);
for m = 1 : length(rowBIN)-1
for n = 1 : length(colBIN)-1
binCombos{n,m} = intersect( rowBIN{m,2}(:,1),colBIN{n,2}(:,1));
end
end
binRows = size(binCombos,1);
binCols = size(binCombos,2)-1;
j = 1;
for n = 1 : binRows;
for m = 1 : binCols;
thisBin = binCombos{n,m}(:,:);
if ~isempty(thisBin)
% polyfit
quadmod = polyfit(x_vrbl(thisBin), y_vrbl(thisBin), 2);
% compute residuals and R²
quadmodcurve = polyval(quadmod,x_vrbl(thisBin)); % evaluate at the sample x values
[r2, rmse] = rsquare(y_vrbl(thisBin), quadmodcurve);
res = y_vrbl(thisBin) - quadmodcurve;
% plot fit
interval = 0.0:0.001:1;
quadmodcurve = polyval(quadmod,interval);
plot(x_vrbl(thisBin), y_vrbl(thisBin), '*', interval, quadmodcurve);
xlabel('x_vrbl');
ylabel('y_vrbl');
axis([0,1,0,1]);
header = ['R^2 = ' num2str(r2) ', coeffs: ' num2str(quadmod)];
title(header);
saveas(gcf, sprintf('plot_%d.pdf', j));
% plot residuals
plot(x_vrbl(thisBin),res,'+');
header2 = ['residuals'];
title(header2);
saveas(gcf, sprintf('residuals_%d.pdf', j));
end
j = j + 1;
end
end
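The fits look fine to me, except that in most cases a linear function would probably suffice and the quadratic term is not needed.
Regarding your remaining question: I am no expert on using R² for nonlinear fits (see Note 2 on Coefficient of determination), but the implementation you are using seems a bit dubious to me. The reason the output is 0 most of the time is the max function on line 65 of rsquare.m, which prevents negative values from being returned. Since the polynomial fit does include a constant term, calling the function as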
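One way to check whether the quadratic term earns its keep is to fit both degrees to the same bin and compare R² with its adjusted variant, which penalises the extra coefficient. The sketch below uses made-up, nearly linear data (an assumption for illustration, not the data from the question) and hand-computed R².

```matlab
% Hypothetical, nearly linear data (for illustration only)
x = (0.05:0.1:0.95)';
y = 0.1 + 0.8*x + 0.01*[1; -1; 2; -2; 1; -1; 0; 1; -2; 1];
n = numel(x);

r2    = zeros(1, 2);
adjR2 = zeros(1, 2);
for deg = 1:2
    p     = polyfit(x, y, deg);
    res   = y - polyval(p, x);               % residuals at the sample points
    ssRes = sum(res.^2);
    ssTot = sum((y - mean(y)).^2);
    r2(deg)    = 1 - ssRes / ssTot;
    % Adjusted R^2: deg predictors plus a constant term
    adjR2(deg) = 1 - (1 - r2(deg)) * (n - 1) / (n - deg - 1);
end
```

Plain R² can never decrease when the degree is raised, so a quadratic that only matches the linear fit on adjusted R² is a hint that the extra term is not needed.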
[r2, rmse] = rsquare(y_vrbl(thisBin), quadmodcurve, false);
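seems more appropriate, and yields R² > 0.9 in most cases.
My advice: check whether R² is the right measure of goodness of fit in your case, and check whether the function is implemented correctly. Functions that ship with Matlab work out of the box, but submissions on the Matlab File Exchange come with no quality guarantee.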
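To illustrate how a clamp such as max(0, ...) can hide a problem, the sketch below computes R² by hand as 1 - SSres/SStot. It is not the code of rsquare.m, only an assumed textbook definition, and the data are hypothetical. When the prediction vector does not correspond to the observations (here, the polynomial is deliberately evaluated at the wrong abscissae), the raw value is negative and the clamp reports 0.

```matlab
% Hypothetical data (for illustration only)
x = [0.1; 0.3; 0.5; 0.7; 0.9];
y = [2.1; 2.9; 4.2; 4.8; 6.1];
p = polyfit(x, y, 2);

% Textbook R^2, without any clamping
rawR2 = @(obs, pred) 1 - sum((obs - pred).^2) / sum((obs - mean(obs)).^2);

good    = rawR2(y, polyval(p, x));   % predictions at the sample x values
bad     = rawR2(y, polyval(p, y));   % deliberately wrong: evaluated at y
clamped = max(0, bad);               % a clamp turns the negative value into 0
```

A result of exactly 0 from a clamped implementation is therefore worth investigating rather than taking at face value. If the Statistics and Machine Learning Toolbox is available, fitlm(x, y, 'quadratic') reports R² as mdl.Rsquared.Ordinary, which can serve as a cross-check.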