Vectorized code is slower than for loops in Matlab

Asked: 2015-04-23 14:23:23

Tags: matlab for-loop matrix vectorization

I have an 8x8 matrix named gimg. I ran the code below on 5 different gimg matrices, once with a for loop and once vectorized.

tic
dm = zeros(size(gimg));

for x = 1:size(gimg, 1)
    for y = 1:size(gimg, 2)
        dm(x, y) = (1/(1 + (x - y)^2))*gimg(x,y);
    end
end
toc

tic
[x,y] = ndgrid(1:size(gimg, 1),1:size(gimg, 2));  

dm = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
toc

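Here are the results (in each pair, the first line is the for loop and the second is the vectorized version):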

Elapsed time is 0.000057 seconds.
Elapsed time is 0.000247 seconds.

Elapsed time is 0.000062 seconds.
Elapsed time is 0.000199 seconds.

Elapsed time is 0.000056 seconds.
Elapsed time is 0.000195 seconds.

Elapsed time is 0.000055 seconds.
Elapsed time is 0.000192 seconds.

Elapsed time is 0.000056 seconds.
Elapsed time is 0.000187 seconds.

Is this because of the matrix?

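I found that the feature accel setting in Matlab changes the for-loop timing dramatically. So my question is: with these JIT compiler features, is it still worth vectorizing code?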

Update: here is an example of my gimg matrix

gimg =

         259          42           0           0           0           0           0           0
          42        1064          41           0           0           0           0           0
           0          55        3444         196           0           0           0           0
           0           0         215        3581          47           0           0           0
           0           0           0         100         806           3           0           0
           0           0           0           0           3           2           0           0
           0           0           0           0           0           0           0           0
           0           0           0           0           0           0           0           0

Update 2: results from @Divakar's code

>> test_vct
------------------------ With Original Loopy Approach
Elapsed time is 5.269883 seconds.
------------------------ With Original Vectorized Approach
Elapsed time is 6.314792 seconds.
------------------------ With Proposed Vectorized Approach
Elapsed time is 3.146764 seconds.
>> 

So on my computer, the original vectorized approach is still slower than the loop.

My computer specs and Matlab version

  • Matlab 2015a
  • Windows 8.1 x64
  • Intel i7 860 2.80 GHz
  • 16 GB RAM
  • Nvidia GeForce GTS 250

1 Answer:

Answer 0 (score: 4)

This seems to be faster than both -

dm = (1./(1+bsxfun(@minus,[1:size(gimg, 1)]',1:size(gimg, 2)).^2).*gimg);
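To make the one-liner easier to follow: bsxfun(@minus, column, row) expands an m-by-1 column and a 1-by-n row into an m-by-n matrix whose (x, y) entry is x - y, so the ndgrid outputs and the ones(size(gimg)) matrix from the question are no longer needed. The following is only a minimal sketch that checks this against the double loop; magic(8) is a stand-in input and the names dm_loop, dm_bsx and max_diff are made up for illustration.

```matlab
% Sketch: check that the bsxfun expression matches the double loop.
gimg = magic(8);                 % stand-in test input (assumption, not the asker's data)
[m, n] = size(gimg);

% Reference: the double loop from the question
dm_loop = zeros(m, n);
for x = 1:m
    for y = 1:n
        dm_loop(x, y) = (1/(1 + (x - y)^2))*gimg(x, y);
    end
end

% bsxfun expands the m-by-1 column and the 1-by-n row into an m-by-n
% matrix of index differences x - y
d = bsxfun(@minus, (1:m)', 1:n);
dm_bsx = gimg ./ (1 + d.^2);

% The two results should agree up to floating-point rounding
max_diff = max(abs(dm_loop(:) - dm_bsx(:)));
fprintf('max abs difference: %g\n', max_diff);
```

In R2016b and later, implicit expansion allows the difference to be written directly as (1:m)' - (1:n); on R2015a, as in the question, bsxfun is still required.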

Benchmarking code -

%// Random input
gimg = rand(8,8);

%// Number of trials (keep this large, so as to get runtimes of 1 sec+)
num_iter = 100000;

disp('------------------------ With Original Loopy Approach')
tic
for iter = 1:num_iter
    dm = zeros(size(gimg));     
    for x = 1:size(gimg, 1)
        for y = 1:size(gimg, 2)
            dm(x, y) = (1/(1 + (x - y)^2))*gimg(x,y);
        end
    end
end
toc

disp('------------------------ With Original Vectorized Approach')
tic
for iter = 1:num_iter
    [x,y] = ndgrid(1:size(gimg, 1),1:size(gimg, 2));
    dm2 = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
end
toc

disp('------------------------ With Proposed Vectorized Approach')
tic
for iter = 1:num_iter
    dm3 = (1./(1+bsxfun(@minus,[1:size(gimg, 1)]',1:size(gimg, 2)).^2).*gimg);
end
toc

Results -

------------------------ With Original Loopy Approach
Elapsed time is 4.996531 seconds.
------------------------ With Original Vectorized Approach
Elapsed time is 2.684011 seconds.
------------------------ With Proposed Vectorized Approach
Elapsed time is 1.338118 seconds.
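
A single tic/toc around code that runs in tens of microseconds, as in the question, is dominated by measurement noise, which is why the benchmark above repeats each approach num_iter times. An alternative is timeit (available since R2013b), which warms up the function, calls it repeatedly, and reports the median time. The sketch below is only illustrative and rests on one assumption not stated in the question: if all the gimg matrices have the same size, the weights 1./(1 + (x - y).^2) depend only on the indices and could be computed once. The names f_bsx, f_pre and w are made up for this example.

```matlab
% Sketch: time the bsxfun version with timeit, and compare against
% precomputed weights (only valid if every gimg has the same size).
gimg = rand(8, 8);               % random input, as in the benchmark above
[m, n] = size(gimg);

% Recompute the weights on every call
f_bsx = @() gimg ./ (1 + bsxfun(@minus, (1:m)', 1:n).^2);

% Compute the weights once, then do only the element-wise product
w = 1 ./ (1 + bsxfun(@minus, (1:m)', 1:n).^2);
f_pre = @() w .* gimg;

t_bsx = timeit(f_bsx);           % median time per call, in seconds
t_pre = timeit(f_pre);

fprintf('bsxfun each call:     %g s\n', t_bsx);
fprintf('precomputed weights:  %g s\n', t_pre);
```

Timings will differ between machines and Matlab versions, as the two sets of results in this thread already show, so any comparison is best rerun on the target system.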