Question

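I am trying to convert an image to text using Tesseract OCR. The image always contains three letters with no rotation or skew, but they are randomly distributed within a 90x50 PNG file.

With only cleaning and conversion to black and white, Tesseract cannot read the text in the image. After I align the letters manually in Paint, the OCR gives an exact match. The alignment does not even need to be perfect. What I want are some hints on how to automatically align the characters in the image before sending it to Tesseract.

I am using Python with Tesseract and OpenCV.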

Original image:

What I did (converted to black and white):

What I want to do (aligned by code):

Answer 1

You can use the following code to achieve this output. Some of the constants may need to be changed to suit your needs:

import cv2
import numpy as np

# Read the image as grayscale and enlarge it 5x so it is easier to see
img = cv2.imread("/home/stephen/Desktop/letters.png", cv2.IMREAD_GRAYSCALE)
if img is None:
    raise FileNotFoundError("could not read the input image")
height, width = img.shape
img = cv2.resize(img, (width * 5, height * 5))

# Threshold the image (letters become white) and find the contours.
# [-2:] keeps the unpacking valid on OpenCV 3, which returns three values.
_, thresh = cv2.threshold(img, 123, 255, cv2.THRESH_BINARY_INV)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]

# Create a white background image to paste the letters on
# (enlarge it if the letters do not fit in 200x200)
bg = np.full((200, 200), 255, np.uint8)
left = 5

# Iterate through the contours
for contour, info in zip(contours, hierarchy[0]):
    # Ignore inside parts (circle in a 'p' or 'b'): keep only contours with no parent
    if info[3] == -1:
        # Get the bounding rectangle
        x, y, w, h = cv2.boundingRect(contour)
        # Paste it onto the background
        bg[5:5 + h, left:left + w] = img[y:y + h, x:x + w]
        left += w + 5
cv2.imshow('thresh', bg)
cv2.waitKey()

How to process a binary image to align sparse letters in a row?
