Question

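Good morning everyone,

Today I'd like to focus on the topic of image processing in C++.

So far I can filter all the noise out of the picture and convert it to black and white.

But now I have two questions.

First question:
You can see a screenshot of the picture below. What is the best way to work out how to rotate the text? In the end it would be nice if the text were horizontal. Does anyone have a good link or an example?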

[image: screenshot of the picture]

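Second question:
How should I continue? Do you think I should send the image to an "optical character recognizer" (a), or should I filter out each letter myself (b)?
If the answer is (a), what is the smallest OCR lib? All the libraries I've found so far (like gocr or tesseract) seem overpowered and hard to integrate into an existing project.

If the answer is (b), what is the best way to save each letter as its own image? Should I search for a white pixel and then go from pixel to pixel, saving the coordinates in a 2D array? And what about the letter "i"? ;)

Thanks to everyone who helps me find my way! And sorry for the strange English above. I'm still a language noob :-)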

Answer 1

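The usual name for the problem in your first question is "skew correction".

[image]

You can Google it (there are lots of references). There is a good article here that shows how to obtain this:

[image]

A simple way to start (though not as good as the one mentioned above) is to perform Principal Component Analysis: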

[image]
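To make the PCA idea a bit more concrete, here is a minimal sketch in plain C++ (no image library). It assumes the image is already binarized and stored row by row with nonzero meaning "white"; the `BinaryImage` struct and the function name `principalAxisAngle` are made up for illustration. It treats the white pixels as a 2D point cloud and returns the orientation of the dominant axis of their covariance matrix.

```cpp
#include <cmath>
#include <vector>

// Hypothetical container: binary image, row-major, nonzero = white (text).
struct BinaryImage {
    int width;
    int height;
    std::vector<unsigned char> pixels;  // size = width * height
    bool isWhite(int x, int y) const { return pixels[y * width + x] != 0; }
};

// Angle (radians, in (-pi/2, pi/2]) between the x axis and the principal
// axis of the white pixels. Image coordinates are used, so y grows downward.
double principalAxisAngle(const BinaryImage& img) {
    double sumX = 0.0, sumY = 0.0;
    long count = 0;
    for (int y = 0; y < img.height; ++y)
        for (int x = 0; x < img.width; ++x)
            if (img.isWhite(x, y)) { sumX += x; sumY += y; ++count; }
    if (count < 2) return 0.0;  // not enough data to define an axis

    const double meanX = sumX / count;
    const double meanY = sumY / count;

    // Entries of the 2x2 scatter (unnormalized covariance) matrix.
    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (int y = 0; y < img.height; ++y)
        for (int x = 0; x < img.width; ++x)
            if (img.isWhite(x, y)) {
                const double dx = x - meanX;
                const double dy = y - meanY;
                sxx += dx * dx;
                syy += dy * dy;
                sxy += dx * dy;
            }

    // Closed-form orientation of the dominant eigenvector of a 2x2
    // symmetric matrix.
    return 0.5 * std::atan2(2.0 * sxy, sxx - syy);
}
```

Rotating the image by the negative of the returned angle should bring the text roughly horizontal. This only works well if the text is clearly longer than it is tall, which is probably why it is a starting point rather than the best method.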

Answer 2

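Regarding your first question:

First, remove any "specks" of noisy white pixels that aren't part of the sequence of letters. A gentle low-pass filter (pixel color = average of the surrounding pixels) followed by clamping the pixel values to pure black or pure white will do. This should get rid of the small "dot" below the "a" character in your image and any other specks.

Now search for the following pixels: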
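A rough sketch of that despeckling step, assuming an 8-bit grayscale image stored row by row. The `GrayImage` struct, the function name `blurAndThreshold`, the 3x3 window (which includes the center pixel) and the default threshold of 128 are all choices made for this example, not something fixed by the description above.

```cpp
#include <vector>

// Hypothetical container: 8-bit grayscale image, row-major.
struct GrayImage {
    int width;
    int height;
    std::vector<unsigned char> pixels;  // size = width * height
};

// 3x3 box blur (mean of the pixel and its in-bounds neighbors), then
// clamp every pixel to pure black (0) or pure white (255).
GrayImage blurAndThreshold(const GrayImage& src, int threshold = 128) {
    GrayImage dst = src;
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            int sum = 0;
            int count = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int nx = x + dx;
                    const int ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= src.width || ny >= src.height)
                        continue;  // skip neighbors outside the image
                    sum += src.pixels[ny * src.width + nx];
                    ++count;
                }
            }
            const int mean = sum / count;
            dst.pixels[y * src.width + x] = (mean >= threshold) ? 255 : 0;
        }
    }
    return dst;
}
```

An isolated white pixel averages out to a dark gray and is clamped to black, while pixels inside a thick stroke stay white. Thin strokes can get eaten too, so the threshold may need tuning for a given image.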

xMin = white pixel with the lowest  x value (white pixel closest to the left edge)
xMax = white pixel with the largest x value (white pixel closest to the right edge)
yMin = white pixel with the lowest  y value (white pixel closest to the top edge)
yMax = white pixel with the largest y value (white pixel closest to the bottom edge)

With these four pixel values, form a bounding box: Rect(xMin, yMin, xMax, yMax);
Compute the area of the bounding box and find its center.

Using the center of the bounding box, rotate the box and its contents by N degrees. (You can pick N: 1 degree would be an OK value.)

Repeat the process of finding xMin, xMax, yMin, yMax and recompute the area.

Continue rotating by N degrees until you've rotated K degrees. Also rotate by -N degrees until you've rotated by -K degrees (where K is the max rotation, say 30 degrees). At each step recompute the area of the bounding box.

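The rotation that produces the bounding box with the smallest area is likely the rotation that aligns the letters parallel to the bottom edge (horizontally aligned).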
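The search described above could be sketched like this. Instead of rotating the whole image at every step, this version rotates only the coordinates of the white pixels, which gives the same bounding box. The `Point` struct and both function names are invented for the example; the defaults of 1 degree per step and a 30 degree limit come from the N and K suggested above.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Point { double x, y; };  // coordinates of one white pixel

// Area of the axis-aligned bounding box of the points after rotating them
// by angleDeg. Rotating about the origin instead of the box center only
// shifts the box, so the area is the same.
double boundingBoxArea(const std::vector<Point>& pts, double angleDeg) {
    if (pts.empty()) return 0.0;
    const double rad = angleDeg * std::acos(-1.0) / 180.0;
    const double c = std::cos(rad);
    const double s = std::sin(rad);
    const double inf = std::numeric_limits<double>::infinity();
    double xMin = inf, xMax = -inf, yMin = inf, yMax = -inf;
    for (const Point& p : pts) {
        const double rx = p.x * c - p.y * s;
        const double ry = p.x * s + p.y * c;
        xMin = std::min(xMin, rx);
        xMax = std::max(xMax, rx);
        yMin = std::min(yMin, ry);
        yMax = std::max(yMax, ry);
    }
    return (xMax - xMin) * (yMax - yMin);
}

// Try every multiple of stepDeg in [-maxDeg, +maxDeg] and return the
// rotation (in degrees) that gives the smallest bounding box area.
double bestRotationDegrees(const std::vector<Point>& whitePixels,
                           double stepDeg = 1.0, double maxDeg = 30.0) {
    double bestAngle = 0.0;
    double bestArea = boundingBoxArea(whitePixels, 0.0);
    const int steps = static_cast<int>(maxDeg / stepDeg);
    for (int i = -steps; i <= steps; ++i) {
        const double angle = i * stepDeg;
        const double area = boundingBoxArea(whitePixels, angle);
        if (area < bestArea) {
            bestArea = area;
            bestAngle = angle;
        }
    }
    return bestAngle;
}
```

The returned angle is the rotation to apply to the pixel coordinates, so it is the one to pass to whatever image rotation routine you already have. Keep in mind that the y axis points downward in image coordinates, so the visual direction of a positive angle depends on that routine's convention.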

Answer 3

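You could measure the height of each white pixel from the bottom and work out how much the text is tilted. It's a very simple approach, but when I tried it, it worked well for me.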
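One possible reading of this idea, as a sketch: for every column, take the lowest white pixel, record its height above the bottom edge, and fit a straight line through those heights with least squares. The line fit, the `BinaryImage` struct and the function name `estimateTiltFromBottom` are assumptions added for this example; the description above does not say how the tilt is derived from the heights.

```cpp
#include <cmath>
#include <vector>

// Hypothetical container: binary image, row-major, nonzero = white (text).
struct BinaryImage {
    int width;
    int height;
    std::vector<unsigned char> pixels;  // size = width * height
};

// Estimated tilt in radians; positive means the text rises toward the right.
double estimateTiltFromBottom(const BinaryImage& img) {
    double sumX = 0.0, sumH = 0.0, sumXX = 0.0, sumXH = 0.0;
    int n = 0;
    for (int x = 0; x < img.width; ++x) {
        // Scan the column upward from the bottom edge.
        for (int y = img.height - 1; y >= 0; --y) {
            if (img.pixels[y * img.width + x] != 0) {
                const double h = img.height - 1 - y;  // height above bottom
                sumX += x;
                sumH += h;
                sumXX += static_cast<double>(x) * x;
                sumXH += x * h;
                ++n;
                break;  // only the lowest white pixel in this column counts
            }
        }
    }
    if (n < 2) return 0.0;  // columns without white pixels are skipped
    const double denom = n * sumXX - sumX * sumX;
    if (denom == 0.0) return 0.0;
    const double slope = (n * sumXH - sumX * sumH) / denom;  // least squares
    return std::atan(slope);
}
```

Descenders (g, p, y) and leftover specks pull the fitted line away from the real baseline, so this is likely to work best on a cleaned-up image.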

Image processing - rotation and optical character recognition
