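Given a data set that may contain noise, and knowing that the real data should consist of evenly spaced peaks, how can I use MATLAB to detect the data I actually want?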

Asked: 2016-08-21 17:48:50

Tags: matlab statistics noise

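I have a set of measurement data that, in theory, should store only the peaks of the power arriving at a receiver. I know these peaks should be spaced 4 seconds apart (at least approximately, since in a real-world scenario I should expect small deviations).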

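The problem is that the system can also receive random data from sources other than the one I am interested in, or echoes from the same source, as shown in the example image: Example data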

In this picture, the blue data is the real data and the red data is random data that should be ignored.

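What is the best way, using MATLAB (and perhaps some statistics), to detect the data points that are most likely the wanted ones? (Sometimes the "parasitic" data can also be spaced 4 seconds apart, if it is an echo.)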

1 answer:

Answer 0 (score: 1)

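The following code finds the time tags that belong to the longest series whose gaps are close to multiples of 4.
The algorithm assumes that valid samples may be missing from the series (it does not require the series to be contiguous).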

%T is the X coordinate of your graph (time tag).
%Notice: The amplitude is irrelevant here.
T = [1, 2, 5, 6, 7, 10, 12, 14];

%Create all possible combinations of indexes of T.
[Y, X] = meshgrid(1:length(T));

%G matrix is the combinations of all gaps:
%T(1) - T(1), T(2) - T(1), T(3) - T(1)...
%Computing every gap (including the reversed pairs and T(1) - T(1)) is inefficient,
%but it is a common way to vectorize this kind of problem in MATLAB.
G = T(X) - T(Y);

%Ignore sign of gaps.
G = abs(G);

%Zero out all gaps that are not a multiple of 4, within a tolerance of 0.1.
%This discards gaps like 5, 11, and 12.7...
G((mod(G, 4) > 0.1) & (mod(G, 4) < 3.9)) = 0;

%C is a counter vector: for each time sample, it counts the nonzero gaps,
%i.e. how many other samples lie a multiple of 4 away from that sample.
C = sum(G > 0, 1);

%Only indexes that reach the maximum count are treated as valid.
ind = (C == max(C));

%Result: the time tags that belong to the longest series.
resT = T(ind)
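As a rough sketch of the same idea with the spacing and tolerance pulled out into variables, the version below measures each gap's distance to the nearest multiple of the period instead of testing both ends of the `mod` range. The names `period`, `tol`, `d`, and `isMatch` are illustrative and do not come from the answer; `bsxfun` is used only so the snippet does not depend on implicit expansion.

```matlab
%Time tags from the example above.
T = [1, 2, 5, 6, 7, 10, 12, 14];
period = 4;   %Assumed nominal spacing between real peaks.
tol = 0.1;    %Assumed tolerance on the spacing.

%Absolute gap between every pair of time tags.
G = abs(bsxfun(@minus, T(:), T(:).'));

%Distance of each gap from the nearest multiple of the period.
r = mod(G, period);
d = min(r, period - r);

%A pair matches when the gap is near a multiple of the period,
%excluding zero gaps (a sample paired with itself).
isMatch = (d <= tol) & (G > tol);

%Count the matching partners of each sample and keep the best-connected ones.
C = sum(isMatch, 1);
resT = T(C == max(C))
```

For the example vector this selects the tags 2, 6, 10 and 14, the same result as the code above.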

Note:
If you are looking for the longest series without missing samples, you can use the following code:

%T must be sorted in ascending order (the loop stops once a gap exceeds 4.1).
T = [1, 2, 5, 6, 7, 10, 12, 14];
len = length(T);
C = zeros(1, len);

for i = 1:len-1
    j = i;
    k = i+1;
    while (k <= len)
        gap = T(k) - T(j);
        if (abs(gap - 4) < 0.1)
            C(i) = C(i) + 1; %Increase series counter.

            %Continue searching from j forward.
            j = k;
            k = j+1;
        else
            k = k+1;
        end

        if (gap > 4.1)
            %Break series if gap is above 4.1
            break;
        end
    end
end

%Now find(C == max(C)) is the index of the first element of the longest contiguous series.
%C counts gaps, so that series contains max(C) + 1 time tags.
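The loop above only reports where the longest contiguous series starts. If the member time tags are needed as well, one possible variation is to record the indexes of each chain while scanning. This is a sketch under the same assumptions (sorted `T`, spacing 4, tolerance 0.1); the names `period`, `tol`, `chain`, `best`, and `seriesT` are illustrative and not part of the answer.

```matlab
%Time tags from the example above (must be sorted in ascending order).
T = [1, 2, 5, 6, 7, 10, 12, 14];
period = 4;   %Assumed nominal spacing between real peaks.
tol = 0.1;    %Assumed tolerance on the spacing.
len = length(T);
best = [];    %Indexes of the longest chain found so far.

for i = 1:len
    chain = i;   %Start a new chain at sample i.
    j = i;       %Index of the last sample accepted into the chain.
    for k = i+1:len
        gap = T(k) - T(j);
        if abs(gap - period) < tol
            %T(k) is one period after the last accepted sample.
            chain(end+1) = k;
            j = k;
        elseif gap > period + tol
            %No later sample can continue this chain.
            break;
        end
    end
    if numel(chain) > numel(best)
        best = chain;
    end
end

%Time tags of the longest contiguous series.
seriesT = T(best)
```

For the example vector the longest chain starts at index 2 and consists of the tags 2, 6, 10 and 14. Ties are resolved in favour of the earliest starting sample, since a later chain replaces `best` only when it is strictly longer.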