Question

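I have noticed that some computations, such as an FFT, run slower inside a parallel loop (per iteration, not in total time), and I believe this is because fft2 uses optimized functions that take advantage of multiple cores.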

The following code shows an implementation of fft2:

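function [ F ] = fft_implemented(A)
% Equivalent to F = fft2(A), computed as a product of DFT matrices.
M = size(A, 1);   % number of rows
N = size(A, 2);   % number of columns
% a1: M-by-M DFT matrix (transforms each column of A)
[ x, y ] = meshgrid( 0 : M - 1, 0 : M - 1 );
a1 = exp( -2 * pi * 1i / M .* x .* y );
% a2: N-by-N DFT matrix (transforms each row of A)
[ x, y ] = meshgrid( 0 : N - 1, 0 : N - 1 );
a2 = exp( -2 * pi * 1i / N .* x .* y );
F = a1 * A * a2;
end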
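
The claim that this matches fft2 can be checked numerically. The snippet below is a minimal sketch, not part of the original post: it rebuilds the same two DFT matrices inline (using an outer product instead of meshgrid, which gives the same entries) so that it runs on its own, and the 8-by-6 test size is an arbitrary choice.

```matlab
% Sanity check: compare the matrix-product DFT with the built-in fft2.
% The input size (8-by-6) is arbitrary and chosen only to keep the check fast.
A = rand(8, 6);
M = size(A, 1);
N = size(A, 2);

% Same DFT matrices as in fft_implemented: entry (i,j) is
% exp(-2*pi*1i*(i-1)*(j-1)/n).
kM = 0 : M - 1;
kN = 0 : N - 1;
a1 = exp(-2 * pi * 1i / M * (kM' * kM));   % M-by-M, acts on columns of A
a2 = exp(-2 * pi * 1i / N * (kN' * kN));   % N-by-N, acts on rows of A
F = a1 * A * a2;

% Largest absolute deviation from fft2; expected to be at round-off level.
err = max(max(abs(F - fft2(A))));
fprintf('max abs difference vs fft2: %g\n', err);
```

If the difference is at round-off level, the two computations agree up to floating-point error.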

To test this function, I used this script:

for i=1:5
    A{i}=rand(800,1280);
    m=A{i};
    tic
    fft_implemented(m);
    toc
end

The results are:

Elapsed time is 0.207859 seconds.
Elapsed time is 0.116945 seconds.
Elapsed time is 0.115507 seconds.
Elapsed time is 0.115516 seconds.
Elapsed time is 0.113433 seconds.

With the parallel version, I get:

parfor i=1:5
    A{i}=rand(800,1280);
    m=A{i};
    tic
    fft_implemented(m);
    toc
end
Elapsed time is 0.941441 seconds.
Elapsed time is 0.872370 seconds.
Elapsed time is 0.868988 seconds.
Elapsed time is 0.979503 seconds.
Elapsed time is 1.004280 seconds.

I don't understand why each iteration is slower in the parallel case than in the serial case (per iteration, not total execution time).
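
One way to probe the multicore explanation suggested at the top of the question is to time the same computation on the client with the default number of computational threads and again with a single thread. This is only a sketch under assumptions: it assumes that parfor workers run with one computational thread each by default, and the handles dftmat and dft2 are hypothetical stand-ins that mirror fft_implemented so the snippet runs on its own.

```matlab
% Hypothetical experiment: does limiting the client to one computational
% thread reproduce the slower per-iteration times seen inside parfor?

% Stand-ins for fft_implemented (same DFT-matrix product, written inline).
dftmat = @(n) exp(-2 * pi * 1i / n * ((0 : n - 1)' * (0 : n - 1)));
dft2   = @(A) dftmat(size(A, 1)) * A * dftmat(size(A, 2));

A = rand(800, 1280);
dft2(A);                             % warm-up call, excluded from the timing

% Timing with the default (usually multithreaded) setting.
tic
F = dft2(A);
tMulti = toc;

% Timing with one computational thread, then restore the old setting.
oldThreads = maxNumCompThreads(1);   % returns the previous thread count
tic
F1 = dft2(A);
tSingle = toc;
maxNumCompThreads(oldThreads);

fprintf('default threads: %f s, single thread: %f s\n', tMulti, tSingle);
```

If the single-threaded time is close to the per-iteration times reported inside parfor, that would be consistent with the matrix products being multithreaded on the client but not on the workers. Memory-bandwidth contention between workers running at the same time could also contribute, so this test alone is not conclusive.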

Non-optimized functions slow down computation in parfor

0 answers: