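I built a function for outlier detection that works well, but given the large amount of data I am processing, I need to get rid of the for loops. So here is the vectorized version (or at least what I think is a vectorized version of my code). The function is called with the following user-initialized parameters; I am using: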
alpha=3
gamma=0.5
k=5
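The series `price` exists in the workspace and is passed in when the function is called. I think I am almost there, but I am running into problems. Here is the code: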
[n] = size(price,1);
x = price;
[j1]=find(x); %output is a column vector with size (n,1) of the following form j1=[1:1:n]
matrix_left=zeros(n, k,'double');
matrix_right=zeros(n, k,'double');
toc
matrix_left(j1(k+1:end),:)=x(j1-k:j1-1);
%The line above returns the following error: Subscript indices must either be real positive integers or logicals.
matrix_right(j1(1:end-k),:)=x(j1+1:j1+k);
%The line above fails with: Subscripted assignment dimension mismatch.
matrix_group=[matrix_left matrix_right];
trimmed_mean=trimmean(matrix_group,10,'round',2);
score=bsxfun(@minus,x,trimmed_mean);
sd=std(matrix_group,2);
temp = abs(score) > (alpha .* sd + gamma);
outmat = temp*1;
What I would like to get, for k = 5, is:
left_matrix (3443,5):
[100.25 103.5 102.25 102.75 103] <---5 left neighbouring observations of the 15th row of **x**
[103.5 102.25 102.75 103 103.5] <---5 left neighbouring observations of the 16th row of **x**
right_matrix(3443,5):
[103.75 104.25 104 104.75 104.25] <---5 right neighbouring observations of the 15th row of **x**
[104.25 104 104.75 104.25 104.5] <---5 right neighbouring observations of the 16th row of **x**
Here is a small sample of the data:
x = Price; price size = (3443, 1)
[...]
100.25 %// '*suppose here we are at the 10th row*'
103.5
102.25
102.75
103
103.5 %// '*here we are at the 15th row*'
103.75
104.25
104
104.75
104.25
104.5
[...]
Time (3443,1) %// the same as price, it reports the time of the transaction (HH:MM:SS).
j1 (3443,1)
1
2
[...]
3442
3443
Thanks in advance, everyone,
George
Answer 0 (score: 0)
Here is the answer. The way to go is (once again) bsxfun:
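[n] = size(price,1);
x = price;
% Indices of the k left neighbours; the first k rows have no full left
% window, so they are filled with right-neighbour indices instead
idxArray_left=bsxfun(@plus,(k+1:n)',-k:-1);
idxArray_fill_left=bsxfun(@plus,(1:k)',1:k);
matrix_left=[idxArray_fill_left; idxArray_left];
% Indices of the k right neighbours; the last k rows are filled with
% left-neighbour indices instead
idxArray_right=bsxfun(@plus,(1:n-k)',1:k);
idxArray_fill_right=bsxfun(@plus,(n-k+1:n)',-k:-1);
matrix_right=[idxArray_right; idxArray_fill_right];
% Look up the 2k neighbouring prices of every observation in one step
idx_matrix=[matrix_left matrix_right];
neigh_matrix=x(idx_matrix);
% Row-wise trimmed mean and standard deviation of the neighbours
trimmed_mean=trimmean(neigh_matrix,10,'round',2);
score=bsxfun(@minus,x,trimmed_mean);
sd=std(neigh_matrix,0,2);
% Flag observations that deviate too far from their neighbourhood
temp = abs(score) > (alpha .* sd + gamma);
outmat = temp*1;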
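To make the index construction easier to follow, here is a minimal sketch on a toy series. The data, the sizes n = 8 and k = 2, and the names `x_toy`, `idx_left`, `idx_right` and `neigh_toy` are made up for illustration and are not part of the original function. It only shows what the index matrix looks like for an interior row and for an edge row.

```matlab
% Toy illustration (hypothetical data): n = 8 observations, k = 2 neighbours per side
x_toy = (10:10:80)';            % column vector 10, 20, ..., 80
n = size(x_toy, 1);
k = 2;

% Left-hand indices: rows k+1..n look back k steps; the first k rows
% are filled with right-hand indices because they have no full left window
idx_left = [bsxfun(@plus, (1:k)', 1:k); bsxfun(@plus, (k+1:n)', -k:-1)];

% Right-hand indices: rows 1..n-k look ahead k steps; the last k rows
% are filled with left-hand indices
idx_right = [bsxfun(@plus, (1:n-k)', 1:k); bsxfun(@plus, (n-k+1:n)', -k:-1)];

idx_matrix = [idx_left idx_right];   % n-by-2k matrix of neighbour indices
neigh_toy  = x_toy(idx_matrix);      % n-by-2k matrix of neighbour values

disp(idx_matrix(4, :))   % interior row 4: 2 3 5 6
disp(idx_matrix(1, :))   % edge row 1:     2 3 2 3
```

Note that in the edge rows the same neighbours appear twice (row 1 uses observations 2 and 3 on both sides), so the trimmed mean and standard deviation there are based on only k distinct values. On MATLAB R2016b or later, `(k+1:n)' + (-k:-1)` should give the same result as the `bsxfun` call, thanks to implicit expansion.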
Thanks to Jonas for giving me a good hint on how to solve the problem.
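For anyone hitting the same two errors as in the question: as far as I can tell, both come from how the colon operator treats vector operands. When the endpoints of `a:b` are vectors, MATLAB uses only their first elements (newer releases may warn about this), so `j1-k:j1-1` collapses to a single range starting at `1-k`. The sketch below spells that out with a made-up short series; `x_demo`, `r_left` and `r_right` are hypothetical names.

```matlab
% Hypothetical short series, just to reproduce the indexing behaviour
x_demo = (1:10)';
n = numel(x_demo);
k = 5;
j1 = (1:n)';                          % same layout as j1 in the question

% What j1-k:j1-1 and j1+1:j1+k reduce to when only the first element counts
r_left  = (j1(1)-k):(j1(1)-1);        % -4:0, contains non-positive indices
r_right = (j1(1)+1):(j1(1)+k);        %  2:6, valid but only k values

% x_demo(r_left) would raise the "subscript indices" error, and the k values
% of x_demo(r_right) cannot fill an (n-k)-by-k block, hence the mismatch
fprintf('non-positive indices on the left: %d\n', sum(r_left < 1));
fprintf('values selected: %d, values needed: %d\n', numel(r_right), (n-k)*k);
```

Building the full n-by-k index matrix first, as in the answer, avoids both problems because each row gets its own range.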