Question

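I was given a school project to recognize various CAPTCHAs, but I ran into some difficulties during implementation.

Images of this type are given as input.

I process them with the following code: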

import cv2 
import pytesseract 

# load image 
fname = 'picture.png' 
# read as grayscale; cv2.COLOR_RGB2GRAY is a cvtColor code, not an imread flag
im = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)

pytesseract.pytesseract.tesseract_cmd = r'C:\Tesseract-OCR\tesseract.exe'

im = im[0:90, 35:150]

im = cv2.blur(im,(3,3)) 

im = cv2.threshold(im, 223 , 250, cv2.THRESH_BINARY) 
im = im[1] 

cv2.imshow('',im) 
cv2.waitKey(0)

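After all the processing, the image looks like this:

At this point I have a question: how can I modify the image so that the computer can read it reliably and, instead of the incorrect TAREQ, outputs 7TXB6Q?

I am trying to read the text from the image using the pytesseract library.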
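For orientation, here is a minimal sketch of how a pytesseract call for a single-line CAPTCHA might look. It is not the call used in the question: the page segmentation mode (`--psm 7`, a single text line) and the character whitelist are assumptions, and `build_config` and `read_captcha` are hypothetical helper names. Whether the whitelist is honored can depend on the Tesseract version and OCR engine in use.

```python
import string


def build_config(psm=7, whitelist=string.ascii_uppercase + string.digits):
    """Build a Tesseract config string (hypothetical helper).

    psm=7 asks Tesseract to treat the image as a single text line;
    the whitelist restricts output to uppercase letters and digits.
    Both values are assumptions about this kind of CAPTCHA.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_captcha(image, config=None):
    """Run OCR on an already preprocessed image (hypothetical helper).

    Requires the pytesseract package and a Tesseract installation.
    """
    import pytesseract  # imported lazily so build_config works without it

    text = pytesseract.image_to_string(image, config=config or build_config())
    return text.strip()
```

With the thresholded image from the code above, the call would be `read_captcha(im)`. On Windows, `pytesseract.pytesseract.tesseract_cmd` still has to point at the Tesseract executable, as in the question's code.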

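I am writing here in the hope of getting some valuable advice (perhaps you know the most suitable way to get text from a picture, or to process the image shown above). Peace to everyone.

More images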

Answer 1

You can try finding the contours and eliminating those with small areas. This preprocessing step should improve the success rate of the OCR.

Before:

import cv2 as cv
import numpy as np

# your thresholded image im
bw = cv.imread('bw.png', cv.IMREAD_GRAYSCALE)

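# [-2] picks the contour list under both OpenCV 3.x (three return values)
# and OpenCV 4.x (two return values)
cnts = cv.findContours(bw, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)[-2]
# remove the largest contour which is background
cnts = cnts[1:]

areas = [cv.contourArea(c) for c in cnts]

thr = 35
# keep only the contours whose area exceeds the threshold
thr_cnts = [c for c, a in zip(cnts, areas) if a > thr]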

disp_img = 255 * np.ones(bw.shape, dtype=np.uint8)
disp_img = cv.drawContours(disp_img, thr_cnts, -1, (0, 0, 0), cv.FILLED)
disp_img = cv.bitwise_or(disp_img, bw)

cv.imshow('result', disp_img)
cv.waitKey()
cv.destroyAllWindows()

cv.imwrite('result.png', disp_img)

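Result:

Edit: It seems that merging the two pieces of code does not give the same result. Here is the complete code from start to finish.

Input: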

import cv2 as cv
import numpy as np

# load image 
fname = 'im.png'
im = cv.imread(fname, cv.IMREAD_GRAYSCALE)

# crop
im = im[0:90, 35:150]

# blurring is essential for denoising
im = cv.blur(im, (3,3))

thr = 219
# the binary threshold value is very important
# using 220 instead of 219 causes loss of a letter
# because it touches to the bottom edge and gets involved in the background
_, im = cv.threshold(im, thr, 255, cv.THRESH_BINARY)

cv.imshow('', im)
cv.waitKey(0)

Thresholded:

# binary image
bw = np.copy(im)

# find contours and corresponding areas
# [-2] picks the contour list under both OpenCV 3.x and 4.x
cnts = cv.findContours(bw, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)[-2]
areas = np.array([cv.contourArea(c) for c in cnts])

thr = 35
# eliminate contours that are smaller than threshold
# also remove the largest contour which is background
keep = np.logical_and(areas > thr, areas != np.max(areas))
thr_cnts = [c for c, k in zip(cnts, keep) if k]

# draw the remaining contours
disp_img = 255 * np.ones(bw.shape, dtype=np.uint8)
disp_img = cv.drawContours(disp_img, thr_cnts, -1, (0, 0, 0), cv.FILLED)
disp_img = cv.bitwise_or(disp_img, bw)

cv.imshow('', disp_img)
cv.waitKey()
cv.destroyAllWindows()

Result:
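To make the area-filtering idea concrete without OpenCV, here is a minimal pure-Python sketch on a tiny binary grid. It is a simplification, not the code from this answer: it counts the pixels of each 4-connected component instead of calling `cv.contourArea`, and `remove_small_components` is a hypothetical helper name.

```python
from collections import deque


def remove_small_components(grid, min_area):
    """Return a copy of grid in which every 4-connected group of 1s
    with fewer than min_area pixels is reset to 0."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # breadth-first flood fill collects one connected component
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # components below the area threshold are treated as noise
            if len(component) < min_area:
                for y, x in component:
                    out[y][x] = 0
    return out


# 1 = ink, 0 = background; the lone pixel at the top right stands in for noise
grid = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
]
cleaned = remove_small_components(grid, min_area=3)
```

The six-pixel block survives and the isolated pixel is removed, which is the same effect the `thr = 35` area threshold has on the speckle noise in the CAPTCHA.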
