How can I vectorize the code below to avoid loops?

Time: 2015-02-17 16:52:58

Tags: matlab neural-network vectorization bsxfun

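The code below is correct, but I would like to vectorize it (and possibly move it to the GPU) to improve its speed.

How can I convert it to vectorized form?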

RF = 4;
inhibatory = 0;
overlap = 3;
act_funct = 'sig';
gap = RF - overlap;
Image1 = rand(30,22);
Image2 = rand(27,19);
size_image2 = size(Image2); % 27x19
Image3 = rand(30,22);
% Derivative of the activation, e.g. act_output.*(1-act_output) for a sigmoid
de_act_output = de_activate_Mat(Image1,act_funct);
result = zeros(size(Image1)); % pre-allocate the output
for u=1:size(Image1,1)
    for v=1:size(Image1,2)
        iLowMax=max(ceil((u-(RF+inhibatory))/(gap-inhibatory)),1);
        iHighMax=min(floor((u-1)/(gap-inhibatory))+1, size_image2(1));
        jLowMax=max(ceil((v-(RF+inhibatory))/(gap-inhibatory)),1);
        jHighMax = min(floor((v-1)/(gap-inhibatory))+1, size_image2(2));
        sum_sens = sum(sum(Image2(iLowMax:iHighMax,jLowMax:jHighMax)));
        sum_val = sum_sens .* Image3(u,v);
        result(u,v) = de_act_output(u,v) .* sum_val;
    end
end
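
The helper de_activate_Mat is not shown in the question, so the snippet cannot run as posted. A hypothetical stand-in, defined ahead of the snippet, could look like the sketch below. It assumes the input already holds sigmoid activations and simply applies the formula from the code comment; the act_funct argument is accepted but ignored, so any activation other than 'sig' is not handled.

```matlab
% Hypothetical stand-in for the question's de_activate_Mat (not the asker's code).
% Assumes act_output already contains sigmoid activations in [0,1].
de_activate_Mat = @(act_output, act_funct) act_output .* (1 - act_output);

% Small usage example; act_funct is unused in this sketch
de_act_output = de_activate_Mat([0 0.5 1], 'sig');
```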

1 Answer:

Answer 0 (score: 1)

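The parallelogram-like block structure iLowMax:iHighMax,jLowMax:jHighMax that you create inside the nested loops does not lend itself to any simple vectorized code. If performance is critical in your case, you could go all out on vectorization, and convolution looks like it would be useful there. Listed here are some tweaks that speed things up by pre-computing most of the other quantities, which should bring a noticeable speedup. Here is the implementation: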

U = 1:size(Image1,1); %// Create arrays of iteration steps
V = 1:size(Image1,2);

%// Calculate arrays of low-high row and column indices 
iLowMax=max(ceil((U-(RF+inhibatory))/(gap-inhibatory)),1);
iHighMax=min(floor((U-1)/(gap-inhibatory))+1, size(Image2,1));

jLowMax=max(ceil((V-(RF+inhibatory))/(gap-inhibatory)),1);
jHighMax = min(floor((V-1)/(gap-inhibatory))+1, size(Image2,2));

sens_sums = zeros(size(Image1)); %// Pre-allocation
for u=1:size(Image1,1)
    for v=1:size(Image1,2)
        sens = Image2(iLowMax(u):iHighMax(u),jLowMax(v):jHighMax(v));
        sens_sums(u,v) = sum(sens(:));
    end
end
result = sens_sums.*Image3.*de_act_output;
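
Since every window is a rectangle, the remaining double loop could in principle be removed with a summed-area table (a 2-D cumulative sum) instead of an explicit convolution. The following is a minimal sketch of that idea rather than part of the original answer. The names step, outSize, S, ref, max_err, conv_sums and max_err_conv are introduced here for illustration, and the parameters mirror the question's setup. The conv2 branch only applies when step equals 1 and Image2 is large enough for the cropped result to cover the output size, as it is with these sizes.

```matlab
% Assumed setup, mirroring the question's parameters
RF = 4; inhibatory = 0; overlap = 3;
gap = RF - overlap;
step = gap - inhibatory;          % hypothetical name for the shared denominator
Image2 = rand(27,19);
outSize = [30 22];                % size of Image1 and Image3 in the question

U = 1:outSize(1);
V = 1:outSize(2);
iLow  = max(ceil((U-(RF+inhibatory))/step), 1);
iHigh = min(floor((U-1)/step)+1, size(Image2,1));
jLow  = max(ceil((V-(RF+inhibatory))/step), 1);
jHigh = min(floor((V-1)/step)+1, size(Image2,2));

% Summed-area table: S(i+1,j+1) holds sum(sum(Image2(1:i,1:j)))
S = zeros(size(Image2)+1);
S(2:end,2:end) = cumsum(cumsum(Image2,1),2);

% Four-corner lookup gives every window sum at once (no loops)
sens_sums = S(iHigh+1,jHigh+1) - S(iLow,jHigh+1) ...
          - S(iHigh+1,jLow)    + S(iLow,jLow);

% Empty windows (low > high) sum to zero in the loop version
sens_sums(iLow > iHigh, :) = 0;
sens_sums(:, jLow > jHigh) = 0;

% Reference: the loop from the answer, used only to check the result
ref = zeros(outSize);
for u = U
    for v = V
        sens = Image2(iLow(u):iHigh(u), jLow(v):jHigh(v));
        ref(u,v) = sum(sens(:));
    end
end
max_err = max(abs(sens_sums(:) - ref(:)));

% Special case step == 1: the windows slide one pixel at a time,
% so a full convolution with a box kernel yields the same sums
if step == 1
    C = conv2(Image2, ones(RF+inhibatory+1));
    conv_sums = C(1:outSize(1), 1:outSize(2));
    max_err_conv = max(abs(conv_sums(:) - ref(:)));
end
```

With sens_sums computed this way, the final line result = sens_sums.*Image3.*de_act_output stays unchanged. The table lookup uses only cumsum, indexing and elementwise arithmetic, which may also make it a reasonable candidate for GPU arrays, though that has not been measured here.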