Question

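If I have an image of a page of text photographed against a uniform background, how can I automatically detect the boundary between the paper and the background?

An example of the kind of image I want to process is shown below. The images I will be dealing with consist of a single page on a uniform background, and the page can be rotated at any angle.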

[Image: example photo of a single page on a uniform background]

Answer 1

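One simple method is to convert the image to grayscale and then threshold it at some known value. The problem with this approach is that we are applying a global threshold, so if you set the threshold too high, some of the paper at the bottom of the image will be lost. If you set the threshold too low, you will certainly get the paper, but you will also include a lot of background pixels, and it will probably be difficult to remove those pixels with post-processing.

One thing I can suggest is to use an adaptive thresholding algorithm. An algorithm that has worked for me in the past is the Bradley-Roth adaptive thresholding algorithm. You can read about it in a post I commented on a while back: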
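To make the trade-off concrete, here is a minimal sketch on a synthetic scene. Everything in it (the image size, the intensities, the lighting ramp and the two thresholds) is an assumption chosen for illustration, not something taken from the photo in the question.

```matlab
% Synthetic 100 x 100 scene (all values here are made up for illustration):
% a grey background (100) with a brighter "page" (200) in the middle.
rows = 100; cols = 100;
page = false(rows, cols);
page(21:80, 31:70) = true;
scene = 100 * ones(rows, cols);
scene(page) = 200;

% Uneven lighting: full brightness on the top row, 20% on the bottom row.
shade = linspace(1.0, 0.2, rows)' * ones(1, cols);
im = scene .* shade;

% Try a high and a low global threshold and count the mistakes of each.
thresholds = [120 70];
lost = zeros(size(thresholds));   % page pixels classified as background
kept = zeros(size(thresholds));   % background pixels classified as page
for k = 1:numel(thresholds)
    bw = im > thresholds(k);
    lost(k) = nnz(page & ~bw);
    kept(k) = nnz(~page & bw);
    fprintf('T = %3d: %4d page pixels lost, %4d background pixels kept\n', ...
        thresholds(k), lost(k), kept(k));
end
```

In this synthetic scene the dim bottom of the page ends up darker than the well-lit top of the background, so no single global threshold can separate the two cleanly.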

Bradley Adaptive Thresholding -- Confused (questions)

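However, if you want the gist of it: first, compute the integral image of the grayscale version of the image. The integral image is important because it lets you calculate the sum of the pixels within any window in O(1) time. Computing the integral image itself is O(n^2) for an n x n image, but you only have to do it once. With the integral image, you scan neighbourhoods of s x s pixels, and for each pixel you check whether its intensity is at least t% below the average intensity within its s x s window. If it is, the pixel is classified as background. Otherwise, it is classified as part of the foreground. This is adaptive because the thresholding is done using local pixel neighbourhoods rather than a global threshold.

I have written an implementation of the Bradley-Roth algorithm for you. The default parameters are s equal to 1/8 of the image width and t equal to 15%. You can therefore call it like this to use the defaults: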
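The O(1) window sum can be illustrated on a tiny matrix. This is only a sketch: the test matrix magic(6), the window corners r1, r2, c1, c2 and the zero-padded copy P are assumptions made for the example. The padding is a convenience so that "one row or column before the first" is a valid index.

```matlab
im = magic(6);                          % any small numeric matrix will do

% Integral image: entry (i,j) holds the sum of im(1:i, 1:j).
intImage = cumsum(cumsum(im, 2), 1);

% Pad with a leading row and column of zeros, so P(i+1, j+1) = intImage(i, j)
% and P(1, :) = P(:, 1) = 0.
P = zeros(size(im) + 1);
P(2:end, 2:end) = intImage;

% Sum of the window rows r1..r2, columns c1..c2 from four lookups.
r1 = 2; r2 = 4; c1 = 3; c2 = 5;
fastSum = P(r2+1, c2+1) - P(r1, c2+1) - P(r2+1, c1) + P(r1, c1);

% Brute-force sum of the same window, for comparison.
slowSum = sum(sum(im(r1:r2, c1:c2)));

% Local test for the centre pixel of the window with t = 15:
% background if the pixel is at least t% below the window mean.
t = 15;
windowMean = fastSum / ((r2 - r1 + 1) * (c2 - c1 + 1));
isBackground = im(3, 4) <= windowMean * (100 - t) / 100;
```

The four lookups cost the same no matter how large the window is, which is what makes a 500 x 500 neighbourhood per pixel affordable.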

out = adaptiveThreshold(im);

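im is the input image and out is a binary image that marks what belongs to the foreground (logical true) or the background (logical false). You can play with the second and third input parameters: s is the size of the thresholding window and t is the percentage we talked about above. The function can then be called like this: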

out = adaptiveThreshold(im, s, t);

The code for the algorithm looks like this:

function [out] = adaptiveThreshold(im, s, t)

%// Error checking of the input
%// Too few or too many arguments? Check this first, because the default
%// for s below needs im to exist
if nargin == 0, error('Too few arguments'); end
if nargin >= 4, error('Too many arguments'); end

%// Default value for s is 1/8th the width of the image
%// Must make sure that this is a whole number
if nargin <= 1, s = round(size(im,2) / 8); end

%// Default value for t is 15
%// t is used to determine whether the current pixel is t% lower than the
%// average in the particular neighbourhood
if nargin <= 2, t = 15; end

%// Convert to grayscale if necessary then cast to double to ensure no
%// saturation
if size(im, 3) == 3
    im = double(rgb2gray(im));
elseif size(im, 3) == 1
    im = double(im);
else
    error('Incompatible image: Must be a colour or grayscale image');
end

%// Compute integral image
intImage = cumsum(cumsum(im, 2), 1);

%// Define grid of points
[rows, cols] = size(im);
[X,Y] = meshgrid(1:cols, 1:rows);

%// Ensure s is even so that we are able to index the image properly
s = s + mod(s,2);

%// Access the four corners of each neighbourhood
x1 = X - s/2; x2 = X + s/2;
y1 = Y - s/2; y2 = Y + s/2;

%// Ensure no co-ordinates are out of bounds
x1(x1 < 1) = 1;
x2(x2 > cols) = cols;
y1(y1 < 1) = 1;
y2(y2 > rows) = rows;

%// Count how many pixels there are in each neighbourhood
count = (x2 - x1) .* (y2 - y1);

%// Compute row and column co-ordinates to access each corner of the
%// neighbourhood for the integral image
f1_x = x2; f1_y = y2;
f2_x = x2; f2_y = y1 - 1; f2_y(f2_y < 1) = 1;
f3_x = x1 - 1; f3_x(f3_x < 1) = 1; f3_y = y2;
f4_x = f3_x; f4_y = f2_y;

%// Compute 1D linear indices for each of the corners
ind_f1 = sub2ind([rows cols], f1_y, f1_x);
ind_f2 = sub2ind([rows cols], f2_y, f2_x);
ind_f3 = sub2ind([rows cols], f3_y, f3_x);
ind_f4 = sub2ind([rows cols], f4_y, f4_x);

%// Calculate the areas for each of the neighbourhoods
sums = intImage(ind_f1) - intImage(ind_f2) - intImage(ind_f3) + ...
    intImage(ind_f4);

%// Determine whether the summed area surpasses a threshold
%// Set this output to 0 if it doesn't
locs = (im .* count) <= (sums * (100 - t) / 100);
out = true(size(im));
out(locs) = false;

end

When I use your image with s = 500 and t = 5, here is the code and the image I get:

im = imread('http://i.stack.imgur.com/MEcaz.jpg');
out = adaptiveThreshold(im, 500, 5);
imshow(out);

[Image: binary output of the adaptive threshold]

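You can see that there are some spurious white pixels near the bottom of the image, and that there are some holes inside the paper that we need to fill. So let's use some morphology: declare a 15 x 15 square structuring element, perform an opening to remove the noisy pixels, and then fill the holes when we are done: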

se = strel('square', 15);
out = imopen(out, se);
out = imfill(out, 'holes');
imshow(out);

This is what I get after all of that:

[Image: mask after opening and hole filling]
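To see what each of the two morphological steps contributes, here is a small sketch on a synthetic mask. The mask, the single-pixel speck, the 2 x 2 hole and the 5 x 5 structuring element are all assumptions made for the example. It uses the same Image Processing Toolbox functions as above (strel, imopen, imfill).

```matlab
% Synthetic 40 x 40 mask: a 20 x 20 "page" with a small hole in it,
% plus one isolated noise pixel in the background.
mask = false(40, 40);
mask(11:30, 11:30) = true;      % the page
mask(20:21, 20:21) = false;     % a 2 x 2 hole inside the page
mask(3, 3) = true;              % a spurious foreground speck

se = strel('square', 5);

% Opening removes anything the structuring element cannot fit inside,
% so the single-pixel speck disappears while the page survives.
opened = imopen(mask, se);

% Hole filling then closes the gap that is fully enclosed by the page.
filled = imfill(opened, 'holes');

fprintf('foreground pixels: %d -> %d -> %d\n', ...
    nnz(mask), nnz(opened), nnz(filled));
```

The order matters: filling holes first would leave the speck in place, and opening with an element larger than the page itself would erase the page.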

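Not bad, eh? Now, if you really want to see what the image looks like with the paper segmented, we can take this mask and multiply it with the original image. That way, any pixels that belong to the paper are kept, while those that belong to the background go away: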

out_colour = bsxfun(@times, im, uint8(out));
imshow(out_colour);

We get this:

[Image: original image with the background masked out]
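Since the question asks for the boundary itself, one possible follow-up is to trace the outline of the mask. This goes beyond the steps above and is only a sketch: it assumes the Image Processing Toolbox function bwboundaries and uses a synthetic rectangle as a stand-in for the real mask.

```matlab
% Synthetic stand-in for the cleaned-up mask: a 20 x 10 rectangle.
mask = false(40, 40);
mask(11:30, 16:25) = true;

% Trace the outer boundary of each object, ignoring boundaries of holes.
% B is a cell array with one N x 2 list of [row, col] points per object.
B = bwboundaries(mask, 'noholes');
boundary = B{1};

% Extent of the traced outline.
rowRange = [min(boundary(:, 1)), max(boundary(:, 1))];
colRange = [min(boundary(:, 2)), max(boundary(:, 2))];
fprintf('rows %d..%d, columns %d..%d\n', rowRange, colRange);
```

On a real mask that contains more than one object, you would first have to decide which boundary is the page, for example by keeping the longest one.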

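You will have to play around with the parameters until they work for you, but the ones above are what I used to get it working for the particular page you showed us. Image processing is all about trial and error, and about putting the processing steps in the right order until you get something that is good enough for your purposes.

Happy image filtering!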
