Question

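I am trying to optimize Matlab code that performs statistical calculations on a large amount of data (1e6 values). I have tried several approaches: loops, the cellfun/arrayfun family of functions, diff, and basic arithmetic. In essence, I need the peak-to-peak value of a set of data and its standard deviation.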

I cannot get it to run in under 24 seconds. Is there a way to improve this code without using additional toolboxes?

Here is what I have tried so far:

clear
close
myData = rand(1e5, 1)/5e6;

M = 1000;
N = length(myData)-M;

PkPk  = NaN(M, 1);
Std  = NaN(M, 1);
myMat = NaN (1, N);


%%%%%%%%%%%%%%%%%%%%%%%%%% peak2peak is part of  Signal Processing Toolbox:
%%%%%%%%%%%%%%%%%%%%%%%%%% can use max()-min()
tic
for x = 1  : M
     myMat =    diff( (reshape(myData(1:x*floor(N/x)),x,floor(N/x)))')   ;
    PkPk (x) = peak2peak(myMat(:)) ;
    Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat));
end
Time1 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1  : M
    myMat =  bsxfun(@minus,  myData(x+1 : x+N) , myData(1:N)) '; % EDIT HERE: transpose
    PkPk (x) = peak2peak(myMat(:)) ; % max - min
    Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
Time2 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1  : M
    myMat =   myData(x+1 : x+N) - myData(1:N);%
    PkPk (x) = peak2peak(myMat(:)) ; % max - min
    Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
Time3 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic 
for x = 1  : M
Std(x) = std( reshape( diff(reshape( myData(1:x*floor(N/x))  , x ,floor(N/x))'),  floor(N/x)' * x -x, 1    )  ) ;
PkPk(x) = peak2peak( reshape( diff(reshape( myData(1:x*floor(N/x))  , x ,floor(N/x))'),  floor(N/x)' * x -x, 1    )  ); 
end
Time4 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1  : M
PkPk (x) = peak2peak( myData(x+1 : x+N) - myData(1:N)) ; % index with x, not M, so each lag is stored
Std(x) = std( myData(x+1 : x+N) - myData(1:N)) ;
end
Time5 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic 
PkPk = (cellfun(@(x)  peak2peak( reshape( diff(reshape( myData(1:x*floor(N/x))  , x ,floor(N/x))'),  floor(N/x)' * x -x, 1    )  )  ,  num2cell(1:M)   ));
Std = (cellfun(@(x)  std( reshape( diff(reshape( myData(1:x*floor(N/x))  , x ,floor(N/x))'),  floor(N/x)' * x -x, 1    )  )  ,  num2cell(1:M)   ));
Time6 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic  
PkPk = cellfun( @(x)  peak2peak(    myData(x+1 : x+N) - myData(1:N)     )    ,  num2cell(1:M) )  ; % same lag-x slice as the other variants
Std = cellfun( @(x)  std(    myData(x+1 : x+N) - myData(1:N)     )    ,  num2cell(1:M) )  ;
Time7 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%

tic 
Std = cellfun( @(x)  std( myData(x+1 : x+N) - myData(1:N)), num2cell(1:M) ) ;
PkPk  = cellfun( @(x)  max( myData(x+1 : x+N) - myData(1:N)) - min( myData(x+1 : x+N) - myData(1:N)) , num2cell(1:M) );
Time8 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
Std = arrayfun( @(x)  std( myData(x+1 : x+N) - myData(1:N)), (1:M) ) ;
PkPk  = arrayfun( @(x)  peak2peak( myData(x+1 : x+N) - myData(1:N))  , (1:M) );
Time9 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
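
The comment at the top of the script notes that peak2peak belongs to the Signal Processing Toolbox and that max()-min() can stand in for it. A minimal toolbox-free sketch of that substitution, assuming real-valued input treated as a single set of samples (the helper name pk2pk is made up for illustration):

```matlab
% Hypothetical helper: peak-to-peak range using only base MATLAB.
% v(:) flattens the input, so a matrix is treated as one set of samples.
pk2pk = @(v) max(v(:)) - min(v(:));

sample = [3 -1 4; 1 -5 9];   % small illustrative input
r = pk2pk(sample);           % largest value 9, smallest -5, so r is 14
```

Inside the loops above, pk2pk(myMat) could then replace peak2peak(myMat(:)).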

Here are my timing results (in seconds):

Time1: 24.47
Time2: 23.56
Time3: 25.20
Time4: 45.44
Time5: 42.99
Time6: 46.27
Time7: 43.62
Time8: 62.49
Time9: 41.69

Thanks!

Answer 1

I took your second solution (the fastest in your benchmark) and modified it slightly.

You gain performance if you stop extracting myData(1:N) in every loop iteration and instead assign it to an array once, before the loop, like this:

tic
myData1toN = myData(1:N);
for x = 1  : M
    myMat =  bsxfun(@minus,  myData(x+1 : x+N) , myData1toN);
    PkPk (x) = peak2peak(myMat(:)) ; % max - min
    Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
clear myData1toN;
Time2 = toc

Time before:

Time2: 20.5618

Time after:

Time2: 14.2260

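Another modification: sum(sum(... can be changed to sum(..., because myMat is a vector here, so the inner sum already returns a scalar and the outer sum only adds up that single value.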
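
A quick check of that claim, assuming the argument is a vector as myMat is in the loop above (the data here is purely illustrative):

```matlab
v = (1:5)';                  % column vector, shaped like myMat in the loop
d = (v - mean(v)).^2;        % squared deviations from the mean: [4 1 0 1 4]'
nested = sum(sum(d));        % the inner sum already collapses d to a scalar
single = sum(d);             % one call gives the same value
```

For a genuine matrix the two forms would differ in general, and sum(d(:)) would be the single-call equivalent.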

Time after:

Time2: 11.6573

By the way, numel(myMat) can be replaced with N, but I did not notice any performance gain from that.
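
Putting these modifications together, the loop might look roughly like the sketch below. It is not timed here and makes a few assumptions: plain subtraction is used in place of bsxfun (both operands are N-by-1, so the result should be the same), max - min replaces the toolbox function peak2peak, the sizes are reduced for illustration, and the names base and d are invented.

```matlab
% Small illustrative sizes; the question uses 1e5 samples and M = 1000.
rng(0);                                  % fixed seed so runs are repeatable
myData = rand(2000, 1) / 5e6;
M = 50;
N = length(myData) - M;

PkPk = NaN(M, 1);
Std  = NaN(M, 1);

base = myData(1:N);                      % sliced once, outside the loop
for x = 1:M
    d = myData(x+1 : x+N) - base;        % lag-x differences (column vector)
    PkPk(x) = max(d) - min(d);           % toolbox-free peak-to-peak
    Std(x)  = sqrt(sum((d - mean(d)).^2) / N);   % single sum, divide by N
end
```

Note that this hand-written formula divides by N, so it corresponds to std(d, 1) rather than the default std(d), which divides by N-1.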

Matlab: computation speed on a large array (1e6 values)
