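I am trying to optimize MATLAB code that runs statistical calculations on a large amount of data (1e6 values). I have tried several approaches, including loops and function-handle helpers (cellfun, arrayfun), using diff or basic arithmetic. Basically, I need to compute the peak-to-peak value and the standard deviation of a data set.
I can't get it to run in under 24 seconds. Is there a way to improve this code without using additional toolboxes?
Here is what I have tried so far: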
clear
close
myData = rand(1e5, 1)/5e6;
M = 1000;
N = length(myData)-M;
PkPk = NaN(M, 1);
Std = NaN(M, 1);
myMat = NaN (1, N);
%%%%%%%%%%%%%%%%%%%%%%%%%% peak2peak is part of Signal Processing Toolbox:
%%%%%%%%%%%%%%%%%%%%%%%%%% can use max()-min()
tic
for x = 1 : M
myMat = diff( (reshape(myData(1:x*floor(N/x)),x,floor(N/x)))') ;
PkPk (x) = peak2peak(myMat(:)) ;
Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat));
end
Time1 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : M
myMat = bsxfun(@minus, myData(x+1 : x+N) , myData(1:N)) '; % EDIT HERE: transpose
PkPk (x) = peak2peak(myMat(:)) ; % max - min
Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
Time2 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : M
myMat = myData(x+1 : x+N) - myData(1:N);%
PkPk (x) = peak2peak(myMat(:)) ; % max - min
Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
Time3 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : M
Std(x) = std( reshape( diff(reshape( myData(1:x*floor(N/x)) , x ,floor(N/x))'), floor(N/x)' * x -x, 1 ) ) ;
PkPk(x) = peak2peak( reshape( diff(reshape( myData(1:x*floor(N/x)) , x ,floor(N/x))'), floor(N/x)' * x -x, 1 ) );
end
Time4 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : M
PkPk (x) = peak2peak( myData(x+1 : x+N) - myData(1:N)) ; % index with x, not M, so each lag gets its own entry
Std(x) = std( myData(x+1 : x+N) - myData(1:N)) ;
end
Time5 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
PkPk = (cellfun(@(x) peak2peak( reshape( diff(reshape( myData(1:x*floor(N/x)) , x ,floor(N/x))'), floor(N/x)' * x -x, 1 ) ) , num2cell(1:M) ));
Std = (cellfun(@(x) std( reshape( diff(reshape( myData(1:x*floor(N/x)) , x ,floor(N/x))'), floor(N/x)' * x -x, 1 ) ) , num2cell(1:M) ));
Time6 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
PkPk = cellfun( @(x) peak2peak( myData(x+1 : x+N) - myData(1:N) ) , num2cell(1:M) ) ; % lag x, consistent with the other variants
Std = cellfun( @(x) std( myData(x+1 : x+N) - myData(1:N) ) , num2cell(1:M) ) ;
Time7 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
Std = cellfun( @(x) std( myData(x+1 : x+N) - myData(1:N)), num2cell(1:M) ) ;
PkPk = cellfun( @(x) max( myData(x+1 : x+N) - myData(1:N)) - min( myData(x+1 : x+N) - myData(1:N)) , num2cell(1:M) );
Time8 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
Std = arrayfun( @(x) std( myData(x+1 : x+N) - myData(1:N)), (1:M) ) ;
PkPk = arrayfun( @(x) peak2peak( myData(x+1 : x+N) - myData(1:N)) , (1:M) );
Time9 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
Here are my timing results (in seconds):
Time1: 24.47
Time2: 23.56
Time3: 25.20
Time4: 45.44
Time5: 42.99
Time6: 46.27
Time7: 43.62
Time8: 62.49
Time9: 41.69
Thanks!
Answer 0 (score: 1)
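I took your second solution (the fastest in your benchmark) and made a few modifications.
You can gain performance by no longer evaluating myData(1:N) in every loop iteration and instead assigning it to an array before the loop, like this: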
tic
myData1toN = myData(1:N);
for x = 1 : M
myMat = bsxfun(@minus, myData(x+1 : x+N) , myData1toN);
PkPk (x) = peak2peak(myMat(:)) ; % max - min
Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
clear myData1toN;
Time2 = toc
Time before:
Time2: 20.5618
Time after:
Time2: 14.2260
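Another modification: sum(sum(...)) can be changed to sum(...). Because myMat is a column vector here, the inner sum already returns a scalar, so the outer sum only adds up a single value.
Time after:
Time2: 11.6573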
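To illustrate why the outer sum is redundant, here is a minimal sketch on a small random column vector (the name v and its size are made up for this example). It also compares the hand-written formula with std(v, 1), which normalizes by N; plain std(v) normalizes by N-1, so it would differ slightly.

```matlab
% Illustrative column vector; the name and size are assumptions for this example.
v = rand(1000, 1);

% Original form: the inner sum collapses the column to a scalar,
% so the outer sum adds up exactly one value.
s2 = sum(sum((v - mean(v)).^2));

% Simplified form with a single sum.
s1 = sum((v - mean(v)).^2);

% Population standard deviation (divide by the number of elements).
stdOneSum = sqrt(s1 / numel(v));

% Built-in equivalent: the second argument 1 selects normalization by N.
stdBuiltin = std(v, 1);
```

For a matrix input the two forms are not interchangeable, since a single sum would then return one value per column.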
By the way, numel(myMat) can be replaced with N, but I did not notice any performance gain from it.
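Putting these changes together, the loop might look like the sketch below. It assumes a much smaller data set than the original so that it runs quickly, uses max minus min in place of peak2peak to avoid the Signal Processing Toolbox (as the comment in the question suggests), and uses plain subtraction instead of bsxfun because both operands are column vectors of the same length. The variable d is a name introduced here for illustration, and no timings are claimed for this variant.

```matlab
% Reduced problem size (assumption for the example; the question uses 1e5 and M = 1000).
myData = rand(2000, 1) / 5e6;
M = 50;
N = length(myData) - M;

PkPk = NaN(M, 1);
Std  = NaN(M, 1);

% Hoist the loop-invariant slice out of the loop.
myData1toN = myData(1:N);

for x = 1:M
    % Difference between the series shifted by x samples and the reference slice.
    d = myData(x+1 : x+N) - myData1toN;

    % Peak-to-peak without the toolbox function.
    PkPk(x) = max(d) - min(d);

    % Population standard deviation: single sum, divide by N.
    Std(x) = sqrt(sum((d - mean(d)).^2) / N);
end
```

Each entry of Std should agree with std(d, 1) for the corresponding lag, which is an easy way to check the hand-written formula.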