I have the following time series:
b = [2 5 110 113 55 115 80 90 120 35 123];
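Each number in b is one data point at one time instant. I computed duration values from b. A duration is formed by all numbers in b that are greater than or equal to 100 and arranged consecutively (all other numbers are discarded). A maximum gap of one number smaller than 100 is allowed. This is what the code for the duration looks like: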
N = 2; % maximum allowed gap
duration = cellfun(@numel, regexp(char((b>=100)+'0'), [repmat('0',1,N) '+'], 'split'));
For b, this gives the following duration values:
duration = [4 3];
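To see what the one-liner does, it can be unpacked into separate steps. This is only a sketch: the intermediate names mask, sep, and runs are introduced here for illustration and are not part of the original code. Because the string starts with a separator, the split may yield an empty first piece (which would appear as a 0 in duration), so the sketch drops zero-length pieces at the end.

```matlab
b = [2 5 110 113 55 115 80 90 120 35 123];
N = 2;                                  % runs are split wherever N or more values < 100 occur in a row

mask = char((b >= 100) + '0');          % '00110100101': '1' marks values >= 100
sep  = [repmat('0', 1, N) '+'];         % '00+': N or more consecutive zeros
runs = regexp(mask, sep, 'split');      % pieces between separators (may include an empty piece)
duration = cellfun(@numel, runs);       % length of each piece, gaps inside a run included
duration = duration(duration > 0);      % drop empty pieces (illustrative extra step)
```

With the data above, the remaining pieces are '1101' and '101', so duration ends up as [4 3].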
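I would like to find the positions (time indices) in b that belong to each value in duration. Next, I want to replace all positions that lie outside a duration with zeros. The result should look like this: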
result = [0 0 3 4 5 6 0 0 9 10 11];
It would be great if someone could help.
Answer 0 (score: 1)
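Here is an approach that uses a regular expression to detect the desired pattern. I assume that a value <100 is allowed only between (not after) values >=100. So the pattern is: one or more values >=100, optionally followed by a single value <100 and then one or more values >=100.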
b = [2 5 110 113 55 115 80 90 120 35 123]; %// data
B = char((b>=100)+'0'); %// convert to string of '0' and '1'
[s, e] = regexp(B, '1+(.1+|)', 'start', 'end'); %// find pattern
y = 1:numel(B);
c = any(bsxfun(@ge, y, s(:)) & bsxfun(@le, y, e(:)), 1); %// filter by locations of pattern; dim 1 keeps c a row vector even for a single match
y = y.*c; %// result
This gives
y =
0 0 3 4 5 6 0 0 9 10 11
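The start and end indices returned by regexp also yield the durations directly, and the position mask can be built with a plain loop instead of bsxfun. The following is a minimal sketch under the same assumptions as the answer's code; the loop variable k and the explicit duration line are additions for illustration.

```matlab
b = [2 5 110 113 55 115 80 90 120 35 123];
B = char((b >= 100) + '0');                       % '00110100101'
[s, e] = regexp(B, '1+(.1+|)', 'start', 'end');   % start and end index of each run

duration = e - s + 1;                             % run lengths, gap values included

% Write each run's positions into y; everything else stays zero
y = zeros(size(b));
for k = 1:numel(s)
    y(s(k):e(k)) = s(k):e(k);
end
```

For the example data, s is [3 9] and e is [6 11], so duration is [4 3] and y matches the result shown above.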
To allow gaps of up to n values <100, the regexp needs to be modified; it has to be built dynamically as a function of n:
b = [2 5 110 113 55 115 80 90 120 35 123]; %// data
n = 2;
B = char((b>=100)+'0'); %// convert to string of '0' and '1'
r = sprintf('1+(.{1,%i}1+)*', n); %// build the regular expression from n
[s, e] = regexp(B, r, 'start', 'end'); %// find pattern
y = 1:numel(B);
c = any(bsxfun(@ge, y, s(:)) & bsxfun(@le, y, e(:)), 1); %// filter by locations of pattern; dim 1 keeps c a row vector even for a single match
y = y.*c; %// result
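To illustrate how n changes the outcome, the sketch below runs the same steps for n = 1 and n = 2 and stores each result. The cell array results is a name introduced here for illustration. The sketch passes the dimension argument 1 to any so that the mask stays a row vector when only a single run is found, which is what happens for n = 2.

```matlab
b = [2 5 110 113 55 115 80 90 120 35 123];
B = char((b >= 100) + '0');                 % '00110100101'
y = 1:numel(B);

results = cell(1, 2);
for n = 1:2
    r = sprintf('1+(.{1,%i}1+)*', n);       % allow gaps of up to n characters
    [s, e] = regexp(B, r, 'start', 'end');
    % any(..., 1) reduces over the runs (rows), also when there is only one run
    c = any(bsxfun(@ge, y, s(:)) & bsxfun(@le, y, e(:)), 1);
    results{n} = y .* c;
end
```

With n = 1 the two zeros at positions 7 and 8 split the data into two runs, giving [0 0 3 4 5 6 0 0 9 10 11]; with n = 2 that gap is bridged and the single run covers positions 3 to 11.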
Answer 1 (score: 0)
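Here is another solution that does not use regexp. It generalizes naturally to arbitrary gap sizes and thresholds. I am not sure whether there is a better way to fill the gaps. Explanations are in the comments: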
% maximum step size and threshold
N = 2;
threshold = 100;
% data
b = [2 5 110 113 55 115 80 90 120 35 123];
% find valid data
B = b >= threshold;
B_ind = find(B);
% find lengths of gaps
step_size = diff(B_ind);
% find acceptable steps (and ignore step size 1)
permissible_steps = 1 < step_size & step_size <= N;
% find beginning and end of runs
good_begin = B_ind([permissible_steps, false]);
good_end = good_begin + step_size(permissible_steps);
% fill gaps in B
for ii = 1:numel(good_begin)
B(good_begin(ii):good_end(ii)) = true;
end
% find durations of runs in B. This finds points where we switch from 0 to
% 1 and vice versa. Due to padding the first match is always a start of a
% run, the last one always an end. There will be an even number of matches,
% so we can reshape and diff and thus find the durations
durations = diff(reshape(find(diff([false, B, false])), 2, []));
% get positions of 'good' data
outpos = zeros(size(b));
outpos(B) = find(B);
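Regarding the open question of filling the gaps without a loop: one possible alternative is a difference array combined with cumsum. This is a sketch of a swapped-in technique, not part of the original answer; the names d and filled are introduced here for illustration.

```matlab
N = 2;
threshold = 100;
b = [2 5 110 113 55 115 80 90 120 35 123];

B = b >= threshold;
B_ind = find(B);
step_size = diff(B_ind);
permissible_steps = 1 < step_size & step_size <= N;
good_begin = B_ind([permissible_steps, false]);
good_end = good_begin + step_size(permissible_steps);

% Difference array: +1 where a bridged span starts, -1 just after it ends
d = zeros(1, numel(B) + 1);
d(good_begin) = 1;
d(good_end + 1) = d(good_end + 1) - 1;

% The running sum is positive exactly inside the bridged spans
filled = B | (cumsum(d(1:end-1)) > 0);

% Positions inside runs keep their index, all others become zero
outpos = (1:numel(b)) .* filled;
```

For the example data this marks positions 3 to 6 and 9 to 11, the same mask that the loop produces.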