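I noticed that some functions, such as FFT computations, are slower in parallel computing (per loop iteration, not in total time), and that this is because fft2 uses an optimized function that runs on multiple cores.
The following code shows an implementation of fft2: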
function F = fft_implemented(A)
% Equivalent to F = fft2(A), computed as a product with two DFT matrices.
M = size(A, 1);
N = size(A, 2);
[x, y] = meshgrid(0:M-1, 0:M-1);
a1 = exp(-2 * pi * 1i / M .* x .* y);   % M-by-M DFT matrix, transforms the columns
[x, y] = meshgrid(0:N-1, 0:N-1);
a2 = exp(-2 * pi * 1i / N .* x .* y);   % N-by-N DFT matrix, transforms the rows
F = a1 * A * a2;
end
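As a quick sanity check that this matrix-product form really matches fft2, something like the following could be used. It is only a sketch: `dftmat` is a hypothetical helper I introduce here to build the same matrices as `a1` and `a2`, and the small test size is an arbitrary choice.

```matlab
% Sanity check: matrix-product DFT versus the built-in fft2.
% dftmat is a hypothetical helper (not part of the original code); the outer
% product (0:n-1)' * (0:n-1) equals x .* y from meshgrid(0:n-1, 0:n-1).
dftmat = @(n) exp(-2 * pi * 1i / n .* ((0:n-1)' * (0:n-1)));

A = rand(8, 12);                          % small size chosen only for illustration
F = dftmat(size(A, 1)) * A * dftmat(size(A, 2));

ref = fft2(A);
relErr = norm(F - ref, 'fro') / norm(ref, 'fro');
fprintf('relative error vs fft2: %g\n', relErr);
```

The relative error should be close to floating-point round-off if the two computations agree.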
To test this function, I used this script:
for i = 1:5
    A{i} = rand(800, 1280);
    m = A{i};
    tic
    fft_implemented(m);
    toc
end
The results are:
Elapsed time is 0.207859 seconds.
Elapsed time is 0.116945 seconds.
Elapsed time is 0.115507 seconds.
Elapsed time is 0.115516 seconds.
Elapsed time is 0.113433 seconds.
With the parallel version, I get:
parfor i = 1:5
    A{i} = rand(800, 1280);
    m = A{i};
    tic
    fft_implemented(m);
    toc
end
Elapsed time is 0.941441 seconds.
Elapsed time is 0.872370 seconds.
Elapsed time is 0.868988 seconds.
Elapsed time is 0.979503 seconds.
Elapsed time is 1.004280 seconds.
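I don't understand why each iteration is slower in the parallel case than in the serial case (per iteration, not in total execution time).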
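One way I could test the multi-core explanation is to time the same computation in the ordinary (serial) session with the default thread count and again with MATLAB restricted to a single computational thread. This is only a sketch: `dftmat` and `dft2` are hypothetical helpers that repeat the computation of fft_implemented, and it rests on my assumption that each parfor worker runs with a single computational thread, which I have not verified on this setup.

```matlab
% Compare default multithreaded timing with single-threaded timing.
% dftmat and dft2 are hypothetical helpers mirroring fft_implemented.
dftmat = @(n) exp(-2 * pi * 1i / n .* ((0:n-1)' * (0:n-1)));
dft2   = @(A) dftmat(size(A, 1)) * A * dftmat(size(A, 2));

A = rand(800, 1280);

tMulti = timeit(@() dft2(A));      % default number of computational threads

oldN = maxNumCompThreads(1);       % limit to one thread; returns the previous limit
tSingle = timeit(@() dft2(A));     % same work on a single thread
maxNumCompThreads(oldN);           % restore the previous limit

fprintf('default threads (%d): %.4f s\n', oldN, tMulti);
fprintf('single thread:        %.4f s\n', tSingle);
```

If the single-threaded time is close to the per-iteration times reported inside parfor, that would support the idea that the per-iteration slowdown comes from losing multithreading on the workers rather than from parfor itself.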