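I tried to write my own image rotation function that uses linear interpolation (see the code below). Running my code on a sample 256x256 image takes about 8 seconds, or roughly 0.12 ms per pixel. Running MATLAB's imrotate function with bilinear interpolation on the same image takes about 0.2 seconds, or roughly 0.003 ms per pixel, which is about 40 times faster.
I suspect there is some vectorization optimization I am missing, but I can't work out where. Any suggestions would be much appreciated.
Code below: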
function [ output ] = rot_input_img_by_angle( input_img, angle )
%rot_input_img_by_angle Rotates the given image by angle about position
% Given an image in the format [y, x, c], rotates it by the given angle
% around the centre of the image
if(nargin < 2)
error('input_img and angle parameters are both required');
end
if(angle == 0)
output = input_img;
return;
end
[height, width, channels] = size(input_img);
half_width = width/2 - 0.5;
half_height = height/2 - 0.5;
% Compute the translation vector to move from a top-left origin to a
% centred-origin
T = [-half_width half_height]';
% A lambda function for creating a 2D rotation matrix
rotmat = @(th) [cos(th) -sin(th); sin(th) cos(th)];
% Convert angle to radians and generate rotation matrix R for CR
% rotation
R = rotmat(deg2rad(angle));
output = zeros(height, width, channels);
for y=1:height
for x=1:width
loc = [x-1 y-1]';
% Transform the current pixel location into the
% origin-at-centre coordinate frame
loc = loc .* [1; -1] + T;
% Apply the inverse rotation mapping to this output pixel to
% determine the location in the original input_img that this pixel
% corresponds to
loc = R * loc;
% Transform back from the origin-at-centre coordinate frame to
% the original input_img's origin-at-top-left frame
loc = (loc - T) .* [1; -1] + [1; 1];
if((loc(1) < 1) || (loc(1) > width) || (loc(2) < 1) || (loc(2) > height))
% This pixel falls outside the input_img - leave it at 0
continue;
end
% Bilinearly interpolate the nearest 4 pixels
left_x = floor(loc(1));
right_x = min(left_x + 1, width);
top_y = floor(loc(2));
bot_y = min(top_y + 1, height);
% Fractional offsets of the sample from the top-left neighbour
fx = loc(1) - left_x;
fy = loc(2) - top_y;
% Bilinear weights: closer neighbours get larger weights, and the four
% weights sum to 1. A sample lying exactly on an input pixel gets
% weight 1 for that pixel.
w = [(1-fx)*(1-fy), fx*(1-fy), (1-fx)*fy, fx*fy];
% Take the weighted average of each color channel's value. Images are
% indexed as (row, column) = (y, x); convert to double so integer image
% classes do not round or saturate mid-sum.
for c=1:channels
output(y, x, c) = ...
w(1) * double(input_img(top_y, left_x, c)) + ...
w(2) * double(input_img(top_y, right_x, c)) + ...
w(3) * double(input_img(bot_y, left_x, c)) + ...
w(4) * double(input_img(bot_y, right_x, c));
end
end
end
output = cast(output, class(input_img));
end
Answer 0: (score: 3)
You can view the function MATLAB uses by typing
edit imrotate
Also, the documentation says:
% Performance Note
% ----------------
% This function may take advantage of hardware optimization for datatypes
% uint8, uint16, and single to run faster.
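In that case MATLAB calls imrotatemex, which is C code compiled so that it can be called from MATLAB, and compiled code is usually faster. I don't know your image type or your system, so I can't say whether that is what happens in your case.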
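One way to check whether the datatype matters on a given machine is to time imrotate on the same image stored in two classes. The following is only a rough sketch: the random 256x256 test image and the 30 degree angle are assumptions made for illustration, and both imrotate and im2uint8 require the Image Processing Toolbox.

```matlab
% Hypothetical test image; replace with your own data
img_double = rand(256, 256);
img_uint8 = im2uint8(img_double);

% timeit calls each function handle several times and returns a
% typical running time in seconds
t_double = timeit(@() imrotate(img_double, 30, 'bilinear', 'crop'));
t_uint8 = timeit(@() imrotate(img_uint8, 30, 'bilinear', 'crop'));

fprintf('double: %.4f ms/pixel\n', 1e3 * t_double / numel(img_double));
fprintf('uint8:  %.4f ms/pixel\n', 1e3 * t_uint8 / numel(img_uint8));
```

If the uint8 timing is noticeably lower, the optimized path mentioned in the performance note is probably being used for that class.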
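You can still speed up your code significantly by vectorizing it. Instead of looping over every x and y value in the image, use meshgrid to build arrays containing all combinations of x and y, and apply the operations to those arrays. This SO question contains a vectorized implementation of rotation with nearest-neighbour interpolation in MATLAB:
Answer 1: (score: 0)
I think the magic happens when MATLAB uses Intel's IPP library: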
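Independently of that linked implementation, here is a minimal sketch of how the meshgrid idea could be applied to the bilinear case from the question. The function name rot_img_vectorized is hypothetical, it assumes the same centre convention and inverse mapping as the question's code, and it has not been benchmarked against imrotate.

```matlab
function output = rot_img_vectorized(input_img, angle)
% Hypothetical vectorized variant of rot_input_img_by_angle: the same
% inverse mapping, applied to all output pixels at once.
[height, width, channels] = size(input_img);
half_width = width/2 - 0.5;
half_height = height/2 - 0.5;

% All output pixel coordinates, moved to an origin-at-centre frame
% with y pointing up
[X, Y] = meshgrid(1:width, 1:height);
xc = (X - 1) - half_width;
yc = half_height - (Y - 1);

% Rotate every coordinate pair in one step
th = deg2rad(angle);
xs = cos(th)*xc - sin(th)*yc;
ys = sin(th)*xc + cos(th)*yc;

% Back to 1-based (column, row) positions in the input image
src_x = xs + half_width + 1;
src_y = half_height - ys + 1;

% Samples that fall outside the input image stay at 0
valid = src_x >= 1 & src_x <= width & src_y >= 1 & src_y <= height;

% Integer neighbours (clamped so indexing never fails) and the
% fractional offsets used as bilinear weights
x0 = min(max(floor(src_x), 1), width);
y0 = min(max(floor(src_y), 1), height);
x1 = min(x0 + 1, width);
y1 = min(y0 + 1, height);
fx = src_x - x0;
fy = src_y - y0;

img = double(input_img);
output = zeros(height, width, channels);
for c = 1:channels
    plane = img(:, :, c);
    % Linear indices fetch the four neighbours of every pixel at once
    val = (1-fx).*(1-fy).*plane(sub2ind([height width], y0, x0)) + ...
          fx.*(1-fy).*plane(sub2ind([height width], y0, x1)) + ...
          (1-fx).*fy.*plane(sub2ind([height width], y1, x0)) + ...
          fx.*fy.*plane(sub2ind([height width], y1, x1));
    val(~valid) = 0;
    output(:, :, c) = val;
end
output = cast(output, class(input_img));
end
```

The only remaining loop runs over the colour channels, so the per-pixel work is done by array operations rather than interpreted loop iterations.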