Question

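The code is supposed to detect letters and digits using OpenCV. The problem is that it cannot detect letters that consist of two parts, such as i and j, or Arabic letters such as ب, ت, ث, ج, خ, ض.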

Here is my code:

import cv2
import numpy as np

image = cv2.imread('output.png')

height, width, depth = image.shape

# resizing the image to find spaces better
image = cv2.resize(image, dsize=(width * 5, height * 4), interpolation=cv2.INTER_CUBIC)
# grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# binary
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)

# dilation
kernel = np.ones((5, 5), np.uint8)
img_dilation = cv2.dilate(thresh, kernel, iterations=1)

# adding GaussianBlur
gsblur = cv2.GaussianBlur(img_dilation, (5, 5), 0)

# find contours
ctrs, hier = cv2.findContours(gsblur.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

m = list()
# sort contours
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])
pchl = list()
dp = image.copy()
for i, ctr in enumerate(sorted_ctrs):
    # Get bounding box
    x, y, w, h = cv2.boundingRect(ctr)
    cv2.rectangle(dp, (x - 10, y - 10), (x + w + 10, y + h + 10), (90, 0, 255), 9)

What do I need to change to detect shapes that consist of more than one part?

Answer 1

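In-line optical character recognition is not an easy task, not so much because of recognizing a single character as because of correctly segmenting the individual characters.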

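A popular approach in this field is to identify each connected item, isolate it, and build some kind of association algorithm. Basically, you will have a bunch of character parts: some are complete characters, some are not. For the incomplete items (those that cannot be classified as a character on their own), check the item's neighborhood and build a character from the items around it (to the left or to the right, or even slightly higher or lower). Feedback from the classifier is essential to the segmentation task.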
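A minimal sketch of such an association step, assuming the parts are available as (x, y, w, h) boxes like the ones cv2.boundingRect returns. The helper names and the min_overlap threshold are illustrative assumptions, not OpenCV functions, and the rule used here (join parts that sit above or below each other) is only one possible neighborhood criterion.

```python
def horizontal_overlap(a, b):
    """Fraction of the narrower box's width that is shared with the other box."""
    ax, _, aw, _ = a
    bx, _, bw, _ = b
    shared = min(ax + aw, bx + bw) - max(ax, bx)
    return max(0, shared) / min(aw, bw)


def union(a, b):
    """Smallest (x, y, w, h) box that contains both a and b."""
    x1 = min(a[0], b[0])
    y1 = min(a[1], b[1])
    x2 = max(a[0] + a[2], b[0] + b[2])
    y2 = max(a[1] + a[3], b[1] + b[3])
    return (x1, y1, x2 - x1, y2 - y1)


def merge_stacked_parts(boxes, min_overlap=0.5):
    """Greedily merge boxes that are stacked above or below each other."""
    merged = []
    for box in sorted(boxes, key=lambda b: b[0]):
        for i, group in enumerate(merged):
            if horizontal_overlap(box, group) >= min_overlap:
                merged[i] = union(group, box)
                break
        else:
            # No existing group overlaps this box, so it starts a new one.
            merged.append(box)
    return merged


# Example: the stem and the dot of an "i", followed by an "l".
parts = [(10, 30, 6, 40), (11, 10, 5, 5), (30, 10, 6, 60)]
print(merge_stacked_parts(parts))
# -> [(10, 10, 6, 60), (30, 10, 6, 60)]
```

In the question's code, the boxes would come from calling cv2.boundingRect on each contour before drawing the rectangles. A purely geometric rule like this will mis-merge in cursive scripts or tightly spaced text, which is why classifier feedback matters.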

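You will find this approach in the literature under the names implicit segmentation or recognition-based segmentation. There are more details in this and this paper, and more about Arabic characters in this article.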
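To illustrate the idea of recognition-based segmentation, here is a rough sketch in which the decision to merge two neighboring parts is driven by a classifier's confidence. The score callable, the max_gap parameter, and toy_score are all hypothetical stand-ins; a real system would plug in a trained character classifier, and the merge rule shown is only one of several reasonable choices.

```python
def union(a, b):
    """Smallest (x, y, w, h) box that contains both a and b."""
    x1 = min(a[0], b[0])
    y1 = min(a[1], b[1])
    x2 = max(a[0] + a[2], b[0] + b[2])
    y2 = max(a[1] + a[3], b[1] + b[3])
    return (x1, y1, x2 - x1, y2 - y1)


def segment_with_feedback(parts, score, max_gap=10):
    """Merge neighboring parts whenever the classifier prefers the merged shape.

    parts: boxes (x, y, w, h), sorted left to right.
    score: hypothetical callable returning a confidence in [0, 1]
           that a box contains exactly one character.
    """
    result = []
    for part in parts:
        if result:
            prev = result[-1]
            gap = part[0] - (prev[0] + prev[2])  # negative when boxes overlap in x
            candidate = union(prev, part)
            # Keep the merge only if it beats the weaker of the two parts.
            if gap <= max_gap and score(candidate) > min(score(prev), score(part)):
                result[-1] = candidate
                continue
        result.append(part)
    return result


def toy_score(box):
    """Toy stand-in for a classifier: assumes characters are about 60 px tall."""
    return min(box[3], 60) / 60


# Stem of an "i", its dot, then an "l" (already sorted by x).
parts = [(10, 30, 6, 40), (11, 10, 5, 5), (30, 10, 6, 60)]
print(segment_with_feedback(parts, toy_score))
# -> [(10, 10, 6, 60), (30, 10, 6, 60)]
```

The dot scores poorly on its own, the merged stem plus dot scores well, so the merge is kept; the "l" is too far away and already scores well, so it stays separate.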

Detecting shapes that are not a single connected component in OpenCV
