Vectorizing nested for loops that fill a dynamic programming table

Asked: 2014-09-27 00:48:00

Tags: matlab image-processing optimization vectorization

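I was wondering whether there is a way to vectorize the nested for loops in this function, which fills in the entries of the 2D dynamic programming table DP. I believe at least the inner loop can be vectorized, since each row depends only on the previous row, but I'm not sure how to do it. Note that this function is called on large 2D arrays (images), so the nested for loops really don't cut it.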

function [cols] = compute_seam(energy)
    [r, c, ~] = size(energy);

    cols = zeros(r);

    DP = padarray(energy, [0, 1], Inf);    
    BP = zeros(r, c);

    for i = 2 : r        
        for j = 1 : c
            [x, l] = min([DP(i - 1, j), DP(i - 1, j + 1), DP(i - 1, j + 2)]);
            DP(i, j + 1) = DP(i, j + 1) + x;
            BP(i, j) = j + (l - 2);
        end
    end

    [~, j] = min(DP(r, :));
    j = j - 1;

    for i = r : -1 : 1
        cols(i) = j;
        j = BP(i, j);
    end
end

1 Answer:

Answer 0 (score: 4)

Vectorizing the innermost nested loop

You are right that at least the inner loop can be vectorized. Here is the modified code for the nested-loop part -

rows_DP = size(DP,1); %// rows in DP

%// Get first row linear indices for a group of neighboring three columns, 
%// which would be incremented as we move between rows with the row iterator
start_ind1 = bsxfun(@plus,[1:rows_DP:2*rows_DP+1]',[0:c-1]*rows_DP); %//'
for i = 2 : r
    ind1 = start_ind1 + i-2; %// setup linear indices for the row of this iteration
    [x,l] = min(DP(ind1),[],1); %// get x and l values in one go
    DP(i,2:c+1) = DP(i,2:c+1) + x; %// set DP values of a row in one go
    BP(i,1:c) = [1:c] + l-2; %// set BP values of a row in one go
end
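
For readers who find the linear-index arithmetic hard to follow, the same row update can probably be written by stacking three shifted slices of the previous row and taking a column-wise minimum. This is a minimal sketch and not part of the original answer: the small `energy` matrix is made up for illustration, and the padding is built by concatenation (instead of `padarray`) so that the Image Processing Toolbox is not required.

```matlab
% Sketch: row-wise update using three shifted slices of the previous row.
energy = magic(5);                        % small illustrative input (assumption)
[r, c] = size(energy);
DP = [Inf(r, 1), energy, Inf(r, 1)];      % same layout as padarray(energy, [0, 1], Inf)
BP = zeros(r, c);

for i = 2 : r
    prev = DP(i - 1, :);                  % previous row, including the Inf padding
    % Stacked rows: 1 = left neighbour, 2 = same column, 3 = right neighbour
    [x, l] = min([prev(1:c); prev(2:c+1); prev(3:c+2)], [], 1);
    DP(i, 2:c+1) = DP(i, 2:c+1) + x;      % accumulate the minimum path cost
    BP(i, :) = (1:c) + l - 2;             % back-pointer: column chosen in row i-1
end
```

Both forms read the same three neighbours per column, and `min` returns the first minimum in either case, so ties should be broken the same way as in the original loop.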

Benchmarking

Benchmark code -

N = 3000; %// Datasize
energy = rand(N);
[r, c, ~] = size(energy);

disp('------------------------------------- With Original Code')
DP = padarray(energy, [0, 1], Inf);
BP = zeros(r, c);
tic
for i = 2 : r
    for j = 1 : c
        [x, l] = min([DP(i - 1, j), DP(i - 1, j + 1), DP(i - 1, j + 2)]);
        DP(i, j + 1) = DP(i, j + 1) + x;
        BP(i, j) = j + (l - 2);
    end
end
toc,clear DP BP x l

disp('------------------------------------- With Vectorized Code')
DP = padarray(energy, [0, 1], Inf);
BP = zeros(r, c);
tic
rows_DP = size(DP,1); %// rows in DP
start_ind1 = bsxfun(@plus,[1:rows_DP:2*rows_DP+1]',[0:c-1]*rows_DP); %//'
for i = 2 : r
    ind1 = start_ind1 + i-2; %// setup linear indices for the row of this iteration
    [x,l] = min(DP(ind1),[],1); %// get x and l values in one go
    DP(i,2:c+1) = DP(i,2:c+1) + x; %// set DP values of a row in one go
    BP(i,1:c) = [1:c] + l-2; %// set BP values of a row in one go
end
toc

Results -

------------------------------------- With Original Code
Elapsed time is 44.200746 seconds.
------------------------------------- With Vectorized Code
Elapsed time is 1.694288 seconds.

So, with a small vectorization tweak, you get roughly a 26x speedup in this benchmark (44.2 s versus 1.69 s).
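
The benchmark only measures time and does not compare the two results. A quick equivalence check along the following lines could be run first. It is a sketch under stated assumptions: the 50-by-40 test size and the random seed are arbitrary choices, and the padding is done by concatenation rather than `padarray`.

```matlab
% Sketch: check that the vectorized update reproduces the nested-loop results.
rng(0);                                   % fixed seed, chosen arbitrarily
energy = rand(50, 40);                    % small test size (assumption)
[r, c] = size(energy);

% Reference: original nested loops
DP_ref = [Inf(r, 1), energy, Inf(r, 1)];
BP_ref = zeros(r, c);
for i = 2 : r
    for j = 1 : c
        [x, l] = min([DP_ref(i-1, j), DP_ref(i-1, j+1), DP_ref(i-1, j+2)]);
        DP_ref(i, j+1) = DP_ref(i, j+1) + x;
        BP_ref(i, j) = j + (l - 2);
    end
end

% Vectorized version from the answer
DP = [Inf(r, 1), energy, Inf(r, 1)];
BP = zeros(r, c);
rows_DP = size(DP, 1);
start_ind1 = bsxfun(@plus, (1:rows_DP:2*rows_DP+1)', (0:c-1)*rows_DP);
for i = 2 : r
    ind1 = start_ind1 + i - 2;            % linear indices of the three neighbours in row i-1
    [x, l] = min(DP(ind1), [], 1);
    DP(i, 2:c+1) = DP(i, 2:c+1) + x;
    BP(i, 1:c) = (1:c) + l - 2;
end

same_DP = isequal(DP, DP_ref);            % exact comparison of the cost tables
same_BP = isequal(BP, BP_ref);            % exact comparison of the back-pointers
fprintf('DP identical: %d, BP identical: %d\n', same_DP, same_BP);
```

Exact comparison with `isequal` is used because both versions add the same operands in the same order, so no floating-point tolerance should be needed.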


More tweaks

A few more optimization tweaks can be tried in the code to improve performance further -

  • cols = zeros(r) can be replaced with cols(r,r) = 0

  • DP = padarray(energy, [0, 1], Inf) can be replaced with DP(1:size(energy,1),1:size(energy,2)+2)=Inf; DP(:,2:end-1) = energy;

  • BP = zeros(r, c) can be replaced with BP(r, c) = 0

The preallocation tweaks used here are inspired by this blog post.
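
The preallocation replacements listed above can be sketched as follows. This is an illustration, not code from the original answer: the tiny `energy` matrix is made up, and the `clear` call is there because allocation by indexing only yields a freshly sized array when the variable does not already exist.

```matlab
% Sketch: allocation by indexing versus zeros/padarray-style construction.
energy = rand(4, 5);                      % illustrative input (assumption)
[r, c] = size(energy);

clear DP BP cols                          % the variables must not exist beforehand
DP(1:r, 1:c+2) = Inf;                     % creates an r-by-(c+2) matrix filled with Inf
DP(:, 2:end-1) = energy;                  % overwrite the interior, keeping the Inf borders
BP(r, c) = 0;                             % creates an r-by-c matrix of zeros
cols(r, r) = 0;                           % creates an r-by-r matrix of zeros, like zeros(r)
```

Note that `zeros(r)` is an r-by-r matrix, while the seam only needs r entries, so `zeros(r, 1)` (or `cols(r, 1) = 0`) would likely be enough if callers expect a column vector. Whether these tricks are measurably faster depends on the MATLAB release, so it is worth timing them on the target setup.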
