I'm having trouble recognizing text in an image (Python)

Time: 2020-12-29 21:27:50

Tags: python opencv scikit-image python-tesseract

I was given a school project to recognize various CAPTCHAs, but I ran into some difficulties while implementing it.

Images of this kind are fed in as input: [image] [image] [image]

I process them with the following code:

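import cv2
import pytesseract

# load the image as grayscale (IMREAD_GRAYSCALE is the imread flag;
# COLOR_RGB2GRAY is a cvtColor code and does not belong here)
fname = 'picture.png'
im = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)

pytesseract.pytesseract.tesseract_cmd = r'C:\Tesseract-OCR\tesseract.exe'

# crop
im = im[0:90, 35:150]

# blur to suppress noise
im = cv2.blur(im, (3, 3))

# binarize; cv2.threshold returns (threshold_used, image)
_, im = cv2.threshold(im, 223, 250, cv2.THRESH_BINARY)

cv2.imshow('', im)
cv2.waitKey(0)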

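After all the processing, the image looks like this: [image]

At this point my question is: how can I modify the image so that the computer reads it reliably and outputs 7TXB6Q instead of the incorrect TAREQ?

I am trying to read the text from the image with the pytesseract library.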
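As a rough sketch of how the thresholded image could be passed to pytesseract (this is not the asker's original call; `read_captcha`, `clean_ocr_text`, the `--psm 7` mode and the character whitelist are assumptions chosen for single-line CAPTCHAs made of capital letters and digits):

```python
import re


def clean_ocr_text(raw, allowed=r"A-Z0-9"):
    """Uppercase the OCR output and drop every character outside `allowed`."""
    return re.sub(f"[^{allowed}]", "", raw.upper())


def read_captcha(image):
    """Run Tesseract on a preprocessed image and return the cleaned text."""
    # imported here so that clean_ocr_text can be used without Tesseract installed
    import pytesseract

    # assumption: --psm 7 treats the image as a single line of text;
    # whether the whitelist is honored depends on the Tesseract version
    config = "--psm 7 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    raw = pytesseract.image_to_string(image, config=config)
    return clean_ocr_text(raw)
```

Calling `read_captcha(im)` on the thresholded image would return only capital letters and digits, so stray whitespace or punctuation in the raw OCR output does not end up in the answer.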

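I am posting here in the hope of getting some valuable advice (perhaps you know the most suitable way to get text from a picture, or to process the image shown above). Peace to everyone.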


More images:

[image] [image] [image] [image]

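1 answer:

Answer 0 (score: 1)

You can try finding the contours and eliminating those with a small area. This preprocessing step should increase the success rate of the OCR.

Before: [image]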

import cv2 as cv
import numpy as np

# your thresholded image im
bw = cv.imread('bw.png', cv.IMREAD_GRAYSCALE)

# findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4;
# the contour list is the second-to-last item in both
cnts = cv.findContours(bw, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)[-2]
# drop the first contour, assumed to be the background (the largest one)
cnts = list(cnts)[1:]

# keep only the contours whose area exceeds the threshold
thr = 35
thr_cnts = [c for c in cnts if cv.contourArea(c) > thr]

# draw the kept contours in black on a white canvas
disp_img = 255 * np.ones(bw.shape, dtype=np.uint8)
disp_img = cv.drawContours(disp_img, thr_cnts, -1, (0, 0, 0), cv.FILLED)
# OR with the input keeps every pixel that is white in bw white
disp_img = cv.bitwise_or(disp_img, bw)

cv.imshow('result', disp_img)
cv.waitKey()
cv.destroyAllWindows()

cv.imwrite('result.png', disp_img)

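Result: [image]


Edit: It seems that merging the two pieces of code does not give the same result. Here is the complete code from start to finish.

Input: [CAPTCHA image]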

import cv2 as cv
import numpy as np

# load image 
fname = 'im.png'
im = cv.imread(fname, cv.IMREAD_GRAYSCALE)

# crop
im = im[0:90, 35:150]

# blurring is essential for denoising
im = cv.blur(im, (3,3))

thr = 219
# the binary threshold value matters a lot here:
# with 220 instead of 219 one letter is lost, because it then touches
# the bottom edge and merges into the background
_, im = cv.threshold(im, thr, 255, cv.THRESH_BINARY)

cv.imshow('', im)
cv.waitKey(0)

Threshold: [image]
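The comment in the code notes that a letter touching the image edge merges into the background. One possible way to make the result less sensitive to the exact threshold, which is not part of the original answer, is to surround the thresholded image with a thin white frame before looking for contours, so that such a letter stays enclosed by white pixels. This is only a sketch: `pad_white`, `unpad` and the 2-pixel margin are hypothetical choices.

```python
import numpy as np


def pad_white(binary, margin=2):
    """Surround a binary image with a white (255) frame `margin` pixels wide."""
    return np.pad(binary, margin, mode="constant", constant_values=255)


def unpad(image, margin=2):
    """Remove the frame added by pad_white and return the original-sized image."""
    return image[margin:-margin, margin:-margin]


# tiny all-black stand-in for a thresholded image
demo = np.zeros((3, 4), dtype=np.uint8)
padded = pad_white(demo)
print(padded.shape)  # (7, 8)
```

The contour filtering would then run on the padded image, and `unpad` would restore the original size before the result is passed to OCR.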

# binary image
bw = np.copy(im)

# find contours and their areas
# ([-2] picks the contour list in both OpenCV 3 and OpenCV 4)
cnts = cv.findContours(bw, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)[-2]
areas = [cv.contourArea(c) for c in cnts]

thr = 35
# eliminate contours whose area is not above the threshold
# also remove the largest contour, which is the background
max_area = max(areas)
thr_cnts = [c for c, a in zip(cnts, areas) if a > thr and a != max_area]

# draw the remaining contours
disp_img = 255 * np.ones(bw.shape, dtype=np.uint8)
disp_img = cv.drawContours(disp_img, thr_cnts, -1, (0, 0, 0), cv.FILLED)
disp_img = cv.bitwise_or(disp_img, bw)

cv.imshow('', disp_img)
cv.waitKey()
cv.destroyAllWindows()

Result: [image]
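To make the selection rule in the code above concrete (keep a contour only if its area exceeds the threshold and it is not the largest one), here is a small worked example on made-up area values. The function name `select_contours` and the numbers are illustrative only and do not come from the images in the question.

```python
def select_contours(areas, min_area):
    """Return the indices of contours to keep.

    A contour is kept when its area is strictly greater than `min_area`
    and it is not the largest contour (assumed to be the background).
    """
    if not areas:
        return []
    largest = max(areas)
    return [i for i, a in enumerate(areas) if a > min_area and a != largest]


# made-up areas: one huge background contour, two specks, three letter-sized blobs
areas = [10350.0, 4.0, 120.5, 36.0, 35.0, 300.0]
print(select_contours(areas, min_area=35))  # [2, 3, 5]
```

Index 0 is dropped as the background, index 1 as noise, and index 4 because an area of exactly 35 is not strictly above the threshold. Note that if two contours share the maximum area, both are removed by this rule.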