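I have a set of measurements that, in theory, should store only the power peaks arriving at a receiver. I know these peaks should be spaced 4 seconds apart (at least approximately, since in a real setup I should expect small deviations).
The problem is that the system can also receive random data from sources other than the one I am interested in, or echoes from the same source, as shown in the example image:
In this image, the blue data is the real data and the red data is random data that should be ignored.
What is the best way, using MATLAB (and probably some statistics), to detect the points that are most likely to be the wanted data? (Sometimes the "parasite" data can also be spaced 4 seconds apart, if it is an echo.)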
Answer 0 (score: 1)
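The following code finds the time tags that belong to the longest series whose gaps are close to multiples of 4.
The algorithm assumes that valid gaps may be missing from the series (it does not search for continuity).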
%T is the X coordinate of your graph (time tags).
%Note: the amplitude is irrelevant here.
T = [1, 2, 5, 6, 7, 10, 12, 14];

%Create all possible combinations of indexes of T.
[Y, X] = meshgrid(1:length(T));

%G is the matrix of all pairwise gaps:
%T(1) - T(1), T(2) - T(1), T(3) - T(1)...
%Computing every gap (including the reversed ones and T(1) - T(1)) is
%inefficient, but it is a common way to solve problems in MATLAB.
G = T(X) - T(Y);

%Ignore the sign of the gaps.
G = abs(G);

%Zero out all gaps that are not a multiple of 4, with a tolerance of 0.1.
%This removes gaps like 5, 11, and 12.7...
G((mod(G, 4) > 0.1) & (mod(G, 4) < 3.9)) = 0;

%C is a counter vector: for each time sample, it counts the nonzero gaps left.
%C now holds, for each sample, how many other samples lie a multiple of 4 away.
C = sum(G > 0, 1);

%Only the indexes that reach the maximum count are valid.
ind = (C == max(C));

%Result: the time tags that belong to the longest series.
resT = T(ind)
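The period (4) and the tolerance (0.1) are hard-coded in the code above. As a minimal sketch of the same pairwise-gap idea, the version below pulls both out into variables and measures the distance of each gap to the nearest multiple of the period. The names `period`, `tol`, `d` and `valid` and the noisy sample values are illustrative assumptions, not part of the original answer. Unlike the code above, this sketch also discards near-zero gaps (smaller than `tol`), which is a deliberate deviation.

```matlab
% Made-up, slightly noisy time tags (assumption for illustration only).
T = [1, 2.05, 5, 6, 7, 9.98, 12, 14.03];
period = 4;    % assumed nominal spacing between wanted peaks
tol    = 0.1;  % assumed allowed deviation from a multiple of the period

% All pairwise absolute gaps (bsxfun keeps this working on older releases).
G = abs(bsxfun(@minus, T(:), T(:).'));

% Distance from each gap to the nearest multiple of the period.
d = abs(G - period * round(G / period));

% A pair counts when its gap is close to a nonzero multiple of the period.
valid = (d <= tol) & (G > tol);

% For each sample, count how many other samples it pairs with.
C = sum(valid, 1);

% Time tags that reach the maximum count.
resT = T(C == max(C))
```

Using the distance to the nearest multiple makes the tolerance symmetric around each multiple, which plays the same role as the two `mod` comparisons in the original code.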
Note:
If you are looking for the longest series with no missing gaps, you can use the following code:
T = [1, 2, 5, 6, 7, 10, 12, 14];

len = length(T);
C = zeros(1, len);

for i = 1:len-1
    j = i;
    k = i+1;
    while (k <= len)
        gap = T(k) - T(j);
        if (abs(gap - 4) < 0.1)
            C(i) = C(i) + 1; %Increase series counter.
            %Continue searching from the matched sample forward.
            j = k;
            k = j+1;
        else
            k = k+1;
        end

        if (gap > 4.1)
            %Break the series if the gap is above 4.1.
            break;
        end
    end
end

%Now find(C == max(C)) is the index where the longest contiguous series begins.
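The loop above only gives the index where the series starts. A minimal sketch of how one might recover the time tags of that series is to walk forward from the start with the same 4 ± 0.1 rule. The names `startIdx` and `series` are illustrative, and `startIdx = 2` is assumed to be the index that `find(C == max(C))` yields for this sample `T`.

```matlab
T = [1, 2, 5, 6, 7, 10, 12, 14];
startIdx = 2;  % assumed start index, e.g. taken from find(C == max(C), 1)

% Start the series with the first member.
series = T(startIdx);

for k = startIdx+1:length(T)
    % Gap between the candidate and the last accepted member.
    gap = T(k) - series(end);
    if abs(gap - 4) < 0.1
        % Candidate is about 4 after the previous member: accept it.
        series(end+1) = T(k);
    elseif gap > 4.1
        % No sample close to 4 after the previous member: series ends.
        break;
    end
end

series
```

Samples that fall between two members (gap below 3.9) are simply skipped, mirroring the `k = k+1` branch of the loop above.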