Question

enter image description here

我有一个如上所述的简单loglog曲线。 Matlab中是否有一些函数可以通过分段线拟合这条曲线并显示这些线段的起点和终点？我在matlab中检查了曲线拟合工具箱。它们似乎通过一条线或一些函数进行曲线拟合。我不想只用一条曲线拟合。

如果没有直接功能，任何实现相同目标的替代方案都适合我。我的目标是通过分段线拟合曲线并获得这些段的终点位置。

Answer 1

首先，您的问题不称为曲线拟合。曲线拟合就是在您拥有数据时，从某种意义上说，您可以找到描述它的最佳函数。另一方面，您希望创建函数的分段线性逼近。

我建议采用以下策略：

手动拆分为多个部分。截面尺寸应取决于导数，大导数 - >小节
在各部分之间的节点处对功能进行采样
找到通过上述各点的线性插值。

以下是执行此操作的代码示例。您可以看到红线（插值）非常接近原始函数，尽管部分数量很少。这是由于自适应部分大小而发生的。

enter image description here

function fitLogLog()
   x = 2:1000;
   y = log(log(x));

   %# Find section sizes, by using an inverse of the approximation of the derivative
   numOfSections = 20;
   indexes = round(linspace(1,numel(y),numOfSections));
   derivativeApprox = diff(y(indexes));
   inverseDerivative = 1./derivativeApprox;
   weightOfSection =  inverseDerivative/sum(inverseDerivative);   
   totalRange = max(x(:))-min(x(:));   
   sectionSize = weightOfSection.* totalRange;

   %# The relevant nodes
   xNodes = x(1) + [ 0 cumsum(sectionSize)];
   yNodes = log(log(xNodes));

   figure;plot(x,y);   
   hold on;
   plot (xNodes,yNodes,'r');
   scatter (xNodes,yNodes,'r');
   legend('log(log(x))','adaptive linear interpolation');
end

Answer 2

Andrey的自适应解决方案提供更准确的整体适应性。但是，如果您想要的是固定长度的片段，那么这里应该可以使用一种方法，该方法也返回所有拟合值的完整集合。如果需要速度，可以进行矢量化。

Nsamp = 1000;     %number of data samples on x-axis
x = [1:Nsamp];    %this is your x-axis
Nlines = 5;       %number of lines to fit

fx = exp(-10*x/Nsamp);  %generate something like your current data, f(x)
gx = NaN(size(fx));     %this will hold your fitted lines, g(x)

joins = round(linspace(1, Nsamp, Nlines+1));  %define equally spaced breaks along the x-axis

dx = diff(x(joins));   %x-change
df = diff(fx(joins));  %f(x)-change

m = df./dx;   %gradient for each section

for i = 1:Nlines
   x1 = joins(i);   %start point
   x2 = joins(i+1); %end point
   gx(x1:x2) = fx(x1) + m(i)*(0:dx(i));   %compute line segment
end

subplot(2,1,1)
h(1,:) = plot(x, fx, 'b', x, gx, 'k', joins, gx(joins), 'ro');
title('Normal Plot')

subplot(2,1,2)
h(2,:) = loglog(x, fx, 'b', x, gx, 'k', joins, gx(joins), 'ro');
title('Log Log Plot')

for ip = 1:2
    subplot(2,1,ip)
    set(h(ip,:), 'LineWidth', 2)
    legend('Data', 'Piecewise Linear', 'Location', 'NorthEastOutside')
    legend boxoff
end

MATLAB plotted output

Answer 3

这不是这个问题的确切答案，但是因为我是基于搜索到达这里的，所以我想回答关于如何创建（不适合）分段线性函数的相关问题，该函数旨在表示散点图中间隔数据的平均值（或中位数或其他一些其他函数）。

首先，使用回归的一个相关但更复杂的替代方案，显然是some MATLAB code listed on the wikipedia page，Multivariate adaptive regression splines。

这里的解决方案是仅计算重叠间隔的平均值以获得分数

function [x, y] = intervalAggregate(Xdata, Ydata, aggFun, intStep, intOverlap)
% intOverlap in [0, 1); 0 for no overlap of intervals, etc.
% intStep    this is the size of the interval being aggregated.

minX = min(Xdata);
maxX = max(Xdata);

minY = min(Ydata);
maxY = max(Ydata);

intInc = intOverlap*intStep; %How far we advance each iteraction.
if intOverlap <= 0
   intInc = intStep;
end
nInt = ceil((maxX-minX)/intInc); %Number of aggregations

parfor i = 1:nInt
    xStart = minX + (i-1)*intInc;
    xEnd   = xStart + intStep;
    intervalIndices = find((Xdata >= xStart) & (Xdata <= xEnd));
    x(i) = aggFun(Xdata(intervalIndices));
    y(i) = aggFun(Ydata(intervalIndices));
end

例如，为了计算一些配对的X和Y数据的平均值，我得到了方便的长度为0.1的区间，彼此大约有1/3重叠（见散点图）：

Scatter plot example for Xdat and Ydat

[x，y] = intervalAggregate（Xdat，Ydat，@ mean，0.1,0.333）

x =

第1至8栏

0.0552    0.0868    0.1170    0.1475    0.1844    0.2173    0.2498    0.2834

第9至15栏

0.3182    0.3561    0.3875    0.4178    0.4494    0.4671    0.4822

y =

第1至8栏

0.9992    0.9983    0.9971    0.9955    0.9927    0.9905    0.9876    0.9846

第9至15栏

0.9803    0.9750    0.9707    0.9653    0.9598    0.9560    0.9537

我们看到随着x增加，y趋于略微减少。从那里，很容易绘制线段和/或执行其他一些平滑。

（请注意，我没有尝试对此解决方案进行矢量化;如果对Xdata进行排序，则可以假设速度更快。）

如何在Matlab中通过一系列分段线拟合曲线？

3 个答案: