How can I reduce the time consumed by a for loop?

Time: 2016-03-17 16:27:38

Tags: matlab for-loop image-processing vectorization

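I am trying to implement a simple pixel-level center-surround image enhancement. The center-surround technique uses statistics between the center pixel of a window and its surrounding neighborhood to decide what enhancement is needed. In the code below, I compare the center pixel with the mean of the surround and, depending on the outcome, switch between two cases to enhance contrast. The code I wrote is as follows: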

im = normalize8(im,1);     %to set the range of pixel from 0-255
s1 = floor(K1/2);          %K1 is the size of the window for surround
M = 1000;                  %is a constant value
out1 = padarray(im,[s1,s1],'symmetric');
out1 = CE(out1,s1,M);
out = (out1(s1+1:end-s1,s1+1:end-s1));
out = normalize8(out,0);   %to set the range of pixel from 0-1



function [out] = CE(out,s,M)
  B = 255;
  out1 = out;
  for i = s+1 : size(out,1) - s
    for j = s+1 : size(out,2) - s
        temp = out(i-s:i+s,j-s:j+s);
        Yij = out1(i,j);
        Sij = (1/(2*s+1)^2)*sum(sum(temp));
          if (Yij>=Sij)
            Aij = A(Yij-Sij,M);
            out1(i,j) = ((B + Aij)*Yij)/(Aij+Yij);
          else
            Aij = A(Sij-Yij,M);
            out1(i,j) = (Aij*Yij)/(Aij+B-Yij);
          end
      end
   end
out = out1;
function [Ax] = A(x,M)
   if x == 0
       Ax = M;
   else 
       Ax = M/x;
    end

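The code does the following:
1) Normalizes the image to the 0-255 range and pads it with extra elements so that the windowing operation can be performed
2) Calls the function CE
3) Inside CE, extracts the windowed image (temp)
4) Computes the mean of the window (Sij)
5) Compares the center of the window (Yij) with the mean (Sij)
6) Performs one of two enhancement operations depending on the result of the comparison
7) Finally, sets the range back to 0-1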
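Written out as formulas, the per-pixel mapping that the CE listing above computes can be read off as follows. The notation (W_ij for the window centered on pixel (i,j), Y'_ij for the output pixel) is introduced here only for illustration and is not taken from the referenced paper.

```latex
S_{ij} = \frac{1}{(2s+1)^2} \sum_{(p,q) \in W_{ij}} Y_{pq},
\qquad
A(x) =
\begin{cases}
M, & x = 0 \\
M / x, & x \neq 0
\end{cases}

Y'_{ij} =
\begin{cases}
\dfrac{\bigl(B + A(Y_{ij} - S_{ij})\bigr)\, Y_{ij}}{A(Y_{ij} - S_{ij}) + Y_{ij}}, & Y_{ij} \ge S_{ij} \\[2ex]
\dfrac{A(S_{ij} - Y_{ij})\, Y_{ij}}{A(S_{ij} - Y_{ij}) + B - Y_{ij}}, & Y_{ij} < S_{ij}
\end{cases}
\qquad B = 255
```

Only S_ij depends on the neighborhood; everything else is an element-wise function of Y_ij and S_ij.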

I have to run this for several window sizes (K1, K2, K3, etc.), and the image size is 1728 * 2034. When the window size is set to 100, it takes a very long time.
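A rough count of the additions performed by the double loop shows why the larger window is so much slower. This is only a back-of-the-envelope sketch: it assumes s = floor(K1/2) as in the code above, so the window is (2*s+1)-by-(2*s+1), and the variable names are made up for illustration.

```matlab
% Back-of-the-envelope work estimate for the double loop (illustrative names).
imgSize  = [1728 2034];          % image size from the question
K        = [21 100];             % the two window sizes that were profiled
s        = floor(K / 2);         % half window size, as in the calling code
winElems = (2*s + 1).^2;         % elements summed per pixel: 441 and 10201
adds     = prod(imgSize) * winElems;   % total additions over the whole image

fprintf('K = %3d: %5d elements per window, %.3g additions in total\n', ...
        [K; winElems; adds]);
```

The window for K1 = 100 has roughly 23 times as many elements as the one for K1 = 21, and the window sum is recomputed from scratch at every one of the 3,514,752 pixels.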

Can I use vectorization at some stage to reduce the time spent in the loops?

The profiler results (window size 21) are as follows: (image: Profiler timing 21*21 window)

The profiler results (window size 100) are as follows: (image: Profiler timing 100*100 window)

I have changed my function and rewritten it without the subfunction. The code is as follows:

function [out] = CE(out,s,M)
B = 255;
Aij = zeros(1,2);
out1 = out;
n_factor = (1/(2*s+1)^2);
for i = s+1 : size(out,1) - s
    for j = s+1 : size(out,2) - s
        temp = out(i-s:i+s,j-s:j+s);
        Yij = out1(i,j);
        Sij = n_factor*sum(sum(temp));
        if Yij-Sij == 0
            Aij(1) = M;
            Aij(2) = M;
        else
            Aij(1) = M/(Yij-Sij);
            Aij(2) = M/(Sij-Yij);
        end
        if (Yij>=Sij)
            out1(i,j) = ((B + Aij(1))*Yij)/(Aij(1)+Yij);
        else
            out1(i,j) = (Aij(2)*Yij)/(Aij(2)+B-Yij);
        end
    end
end
out = out1;

The speed improved slightly, from 93 seconds to 88 seconds. Any other suggestions for improving my code are welcome.

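I tried to follow the suggestion I was given: replace the sliding window with a convolution and then vectorize the rest. The code below is my implementation, but I am not getting the expected result.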

function [out_im] = CE_conv(im,s,M)
B = 255;
temp = ones(2*s,2*s);
temp = temp ./ numel(temp);
out1 = conv2(im,temp,'same');
out_im = im;
Aij = im-out1;                             %same as Yij-Sij
Aij1 = out1-im;                            %same as Sij-Yij
Mij = Aij;
Mij(Aij>0) = M./Aij(Aij>0);                % if Yij>Sij  Mij = M/Yij-Sij;
Mij(Aij<0) = M./Aij1(Aij<0);               % if Yij<Sij  Mij = M/Sij-Yij;
Mij(Aij==0) = M;                           % if Yij-Sij == 0 Mij = M;
out_im(Aij>=0) = ((B + Mij(Aij>=0)).*im(Aij>=0))./(Mij(Aij>=0)+im(Aij>=0));
out_im(Aij<0) = (Mij(Aij<0).*im(Aij<0))./ (Mij(Aij<0)+B-im(Aij<0));

I can't figure out where I went wrong.
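One difference that is visible from the two listings alone: the loop averages a (2*s+1)-by-(2*s+1) window, whereas CE_conv builds a 2*s-by-2*s kernel, and conv2 with 'same' treats everything outside the image as zero instead of using the symmetric padding. The sketch below checks the kernel-size part on a small made-up image; the test image and variable names are assumptions for illustration, and this may not be the only cause of the mismatch.

```matlab
% Illustrative check of the kernel size; im, s and the names are made up.
im = reshape(1:81, 9, 9);   % small test image with a simple linear ramp
s  = 2;                     % half window size, so the loop window is 5-by-5
i  = 5;  j = 5;             % an interior pixel, away from the border

% What the loop computes as Sij for this pixel.
loopMean = mean(mean(im(i-s:i+s, j-s:j+s)));

% Kernel as built in CE_conv (even size, cannot be centered on the pixel).
evenK = conv2(im, ones(2*s) / (2*s)^2, 'same');

% Kernel with the same footprint as the loop window.
oddK  = conv2(im, ones(2*s+1) / (2*s+1)^2, 'same');

fprintf('loop: %.4f   2s kernel: %.4f   2s+1 kernel: %.4f\n', ...
        loopMean, evenK(i,j), oddK(i,j));
```

At interior pixels the (2*s+1) kernel reproduces the loop mean, while the 2*s kernel averages a smaller, off-center neighborhood. Near the border both convolution results still differ from the loop because of the zero padding.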

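The following paper explains in detail what I am trying to implement:
Vonikakis, Vassilios, and Ioannis Andreadis. "Multi-scale image contrast enhancement." In Control, Automation, Robotics and Vision, 2008. ICARCV 2008. 10th International Conference on, pp. 856-861. IEEE, 2008.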

2 Answers:

Answer 0: (score: 1)

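I tried processing with colfilt and nlfilter to see whether these times could be reduced, since both are usually much faster than for loops for sliding-window image processing. Both work well for relatively small windows. For a 2048x2048-pixel image and a 10x10 window, the colfilt solution takes about 5 seconds (on my personal computer). With a 21x21 window the time jumps to 27 seconds, but that is still an improvement over the times shown in the question. Unfortunately, I don't have enough memory to run colfilt with a 100x100 window; the nlfilter solution does work at that size, but it takes about 120 seconds.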

Here is the code.

Solution using colfilt:

function outval = enhancematrix(inputmatrix,M,B)
%Inputmatrix is a 2D matrix or column vector, outval is a 1D row vector.

% If inputmatrix is made of integers...
inputmatrix = double(inputmatrix);

%1. Compute S and Y
normFactor = 1 / size(inputmatrix,1); % Each column holds one whole window, so this is 1/(w*w).
S = normFactor*sum(inputmatrix,1); % Sum over the columns. 
Y = inputmatrix(ceil(size(inputmatrix,1)/2),:); % Center row.
% So far we have all S and Y, one value per column.

%2. Compute A(abs(Y-S)) 
A = Afunc(abs(S-Y),M);
% And all A: one value per column.

%3. The tricky part. If Y(i)-S(i) >= 0 use the first formula, otherwise the second.
doPositive = (Y >= S); % >= matches the original loop (Yij>=Sij).
doNegative = ~doPositive;

outval = zeros(1,size(inputmatrix,2));

outval(doPositive) = ((B + A(doPositive)) .* Y(doPositive)) ./ (A(doPositive) + Y(doPositive)); % (B+A)*Y, as in the loop
outval(doNegative) = (A(doNegative) .* Y(doNegative)) ./ (A(doNegative) + B - Y(doNegative));

end

function out = Afunc(x,M)
% Input x is a row vector. Output is another row vector.
    out = x;
    out(x == 0) =  M;
    out(x ~= 0) = M./x(x ~= 0);
end
  

To call it, simply:

M = 1000; B = 255; enhancenow = @(x) enhancematrix(x,M,B);
w = 21 % windowsize
result = colfilt(inputImage,[w w],'sliding',enhancenow);

Solution using nlfilter:

function outval = enhanceimagecontrast(neighbourhood,M,B)

%1. Compute S and Y
normFactor = 1 / numel(neighbourhood); % 1/(w*w), the number of pixels in the window.
S = normFactor*sum(neighbourhood(:));
Y = neighbourhood(ceil(size(neighbourhood,1)/2),ceil(size(neighbourhood,2)/2));


%2. Compute A(abs(Y-S))
test = (Y>=S);
A = Afunc(abs(Y-S),M);

%3. Return outval
if test
    outval = ((B + A) * Y) / (A + Y);
else
    outval = (A * Y) / (A + B - Y);
end


function aval = Afunc(x,M)
if (x == 0)
    aval = M;
else
    aval = M/x;
end
  

To call it, simply:

M = 1000; B = 255; enhancenow = @(x) enhanceimagecontrast(x,M,B);
w = 21 % windowsize
result = nlfilter(inputImage,[w w], enhancenow);

I haven't spent much time checking that everything is 100% correct, but I did see some nice contrast enhancement (hair looks particularly good).

Answer 1: (score: 0)

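This answer implements Peter's suggestion. I debugged my implementation and show here the final working version of the fast implementation.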

function [out_im] = CE_conv(im,s,M)
B = 255;
im = double(im);                           % imread returns uint8; convert before the arithmetic below
im = ( im - min(im(:)) ) ./ ( max(im(:)) - min(im(:)) )*255;
h = ones(s,s)./(s*s);
out1 = imfilter(im,h,'conv');
out_im = im;
Aij = im-out1;                             %same as Yij-Sij
Aij1 = out1-im;                            %same as Sij-Yij
Mij = Aij;
Mij(Aij>0) = M./Aij(Aij>0);                % if Yij>Sij  Mij = M/(Yij-Sij);
Mij(Aij<0) = M./Aij1(Aij<0);               % if Yij<Sij  Mij = M/(Sij-Yij);
Mij(Aij==0) = M;                           % if Yij-Sij == 0 Mij = M;
out_im(Aij>=0) = ((B + Mij(Aij>=0)).*im(Aij>=0))./(Mij(Aij>=0)+im(Aij>=0));
out_im(Aij<0) = (Mij(Aij<0).*im(Aij<0))./ (Mij(Aij<0)+B-im(Aij<0));
out_im = ( out_im - min(out_im(:)) ) ./ ( max(out_im(:)) - min(out_im(:)) );
  

To call this function, use the following code:

I = imread('pout.tif');
w_size = 51;
M = 4000;
output = CE_conv(I(:,:,1),w_size,M);
  

The output for the 'pout.tif' image is shown below:

(image: output for pout.tif)

With this implementation, the execution time for a bigger image and a 100 * 100 block size is about 5 seconds.
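For comparison with the original double loop, here is a minimal sketch of the same idea using only base MATLAB (conv2 and indexing, no Image Processing Toolbox). It assumes s is the half window size as in the question, mirrors the border by index lookup in place of padarray(...,'symmetric'), and uses a synthetic test image; all variable names are illustrative and not from the posts above.

```matlab
% Sketch: vectorized center/surround enhancement with base MATLAB only.
% The test image and all variable names are illustrative assumptions.
im = 127.5 * (1 + sin((1:40)' * (1:50) / 7));   % synthetic 40-by-50 image in 0-255
s  = 3;                    % half window size; the window is (2*s+1)-by-(2*s+1)
M  = 1000;
B  = 255;

% Symmetric (mirror) padding by index lookup: s rows/columns on each side.
[nr, nc] = size(im);
rows   = [s:-1:1, 1:nr, nr:-1:nr-s+1];
cols   = [s:-1:1, 1:nc, nc:-1:nc-s+1];
padded = im(rows, cols);

% Local mean of every window in one call; 'valid' keeps only positions
% where the kernel fits entirely inside the padded image.
k = 2*s + 1;
S = conv2(padded, ones(k) / k^2, 'valid');      % same size as im

% A(|Y - S|), with the x == 0 case set to M afterwards.
D = abs(im - S);
A = M ./ D;
A(D == 0) = M;

% Pick one of the two formulas per pixel with a logical mask.
brighter = (im >= S);
out = zeros(size(im));
out(brighter)  = ((B + A(brighter)) .* im(brighter)) ./ (A(brighter) + im(brighter));
out(~brighter) = (A(~brighter) .* im(~brighter)) ./ (A(~brighter) + B - im(~brighter));
```

Because the padding and the (2*s+1) kernel match the loop, the result should agree with the loop version up to floating-point rounding. This sketch has not been timed on the 1728 * 2034 image.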