如何在Matlab的lsqcurvefit的成本函数中考虑对数距离?

时间:2019-04-04 08:09:00

标签: matlab curve-fitting

当我使用lsqcurvefit校准任意函数fun的参数时

fun = @(p,x)(p(1)./x .* 1./(p(3)*sqrt(2*pi)).*exp(-(log(x)-p(2)).^2./(2*p(3)^2)));

在两种情况下p_inp_out通过这样做

opts = optimoptions('lsqcurvefit','TolX',1e-4,'TolFun',1e-8);
p0 = [1,1,1];
p_in = lsqcurvefit(fun,p0,xR_in,yR_in,[],[],opts);
p_out = lsqcurvefit(fun,p0,xR_out,yR_out,[],[],opts);

我乍看之下两条曲线都非常合适

Fitted curves in linear scale

但是,如果我以对数刻度看相同的结果,则拟合度看起来不再那么好了:

Fitted curves in linear scale

这使我想知道是否有可能考虑lsqcurvefit的成本函数中的对数误差,这基本上意味着要查看函数的每个x值的相对误差。 / p>

Matlab是否有可能执行此操作?

示例数据:

xR_in =[0.0649, 0.0749, 0.0865, 0.1000, 0.1155, 0.1334, 0.1540, 0.1779, 0.2054, 0.2371, 0.2739, 0.3162, 0.3651, ...
        0.4217, 0.4870, 0.5623, 0.6494, 0.7498, 0.8660, 1.0000, 1.1548, 1.3335, 1.5399, 1.7782, 2.0535, 2.3714, ...
        2.7384, 3.1623, 3.6517, 4.2170, 4.8696, 5.6234, 6.4938, 11.5477   13.3351];

xR_out = [0.487, 0.562, 0.649, 0.750, 0.866, 1.000, 1.155, 1.334, 1.540, 1.778, 2.054, 2.371, 2.738, 3.162, 3.652, ...
          4.217, 4.870, 5.623, 6.494, 7.499, 8.660, 10.000, 11.548, 13.335, 15.399, 17.783, 20.535, 23.714, 27.384, ...
          31.623, 36.517, 42.170, 48.697, 56.234, 64.938, 74.989, 86.596, 100.00];

yR_in = [0.00455681508591336, 0.00873409885607543, 0.0181688255259165, 0.0538821352554514, 0.117259031001522, ...
         0.193077037062548, 0.266434471781847, 0.325746362482833, 0.365146728802310, 0.386958383047475, ...
         0.403635741471857, 0.432215334550296, 0.485674792559743, 0.567693518475570, 0.668583592821924, ...
         0.768926132246355, 0.859501365700179, 0.933719545040531, 0.980404014586848, 1, 0.995168235818486, ...
         0.968831382298403, 0.918925456426510, 0.840263581309774, 0.730217907888984, 0.593965562217998, ...
         0.445252129988812, 0.302478487418430, 0.182455631766952, 0.0946623004432102, 0.0395136730456252, ...
         0.0110944993303772, 0.00304918005164052, 0.000176066405779416, 0.000271107188644644];

yR_out = [0.141149758427413, 0.327528725181928, 0.531429476303118, 0.734825800245580, 0.905142105752982, ...
          1, 0.994585150172040, 0.897227892811480, 0.737878031509986, 0.553911757285727, 0.378467001536220, ...
          0.235489445927972, 0.136533896783523, 0.0811622468252899, 0.0597202443876324, 0.0577324907072722, ...
          0.0649847546602922, 0.0739253772095743, 0.0770973088912625, 0.0732042102857309, 0.0642224578204625, ...
          0.0538581912412821, 0.0446084800059044, 0.0369556660640144, 0.0301643235397618, 0.0235320482975639, ...
          0.0170281463058672, 0.0111839380638397, 0.00660341791973352, 0.00352225498991166, 0.00174404920373791, ...
          0.000833106531219361, 0.000466707400714911, 0.000321368144365539, 0.000238498897310069, 0.000187001782051892, ...
          0.000152087422572435, 0.000120039487163363];

1 个答案:

答案 0 :(得分:1)

对于这两个数据集,我都能找到的最佳峰值方程是幂函数定律,它具有指数截止方程,“ y = C * pow(x,-1.0 * T)* exp(-1.0 * x / K)”,请参见下面两个数据集的图表,并注意x轴比例的差异。

对于“输入”数据,参数为:

C =  3.7599874401256059E+00
T = -1.4597138492114836E+00
K =  7.5655555953645282E-01

收益率R平方= 0.0.989和RMSE = 0.037

in_plot

“输出”数据既有一个尖锐的峰,又在尖锐峰的底部有一个“凸点”,没有一个单独的峰值函数可以建模-实际上,它有一个大的峰加上一个小得多且不那么尖锐的秒高峰。

对于“输出”数据,参数为:

C =  7.5124684986001625E+01
T = -4.9620437310832193E+00
K =  2.2970202935383399E-01

R平方= 0.984,RMSE = 0.045

out_plot