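I am trying to extract the account number from a cheque image. My approach is to find the rectangle that contains the account number, crop its bounding rectangle, and feed the crop to OCR to get the text.
The problem I am facing is that when the rectangle is faint and light in color, its edges are not fully connected after thresholding, so I cannot get a rectangular contour.
How can I overcome this? What I have tried, without success:
With the above in mind, can someone help me solve this?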
Libraries and versions used
scikit-image==0.13.1
opencv-python==3.3.0.10
Code
from skimage.filters import threshold_adaptive, threshold_local
import numpy as np  # needed for np.ones in step 3
import cv2
Step 1:
image = cv2.imread('cropped.png')
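Step 2:
Use adaptive thresholding from skimage to remove the background so that I can get the rectangular box around the account number. This works for cheques where the rectangle is more distinct, but when the rectangle's edges are thin or light in color, thresholding leaves the edges disconnected, so I cannot find the contour. I have attached an example of this further down in the question.
account_number_block = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # cv2.imread returns BGR, not RGB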
account_number_block = threshold_adaptive(account_number_block, 251, offset=20)
account_number_block = account_number_block.astype("uint8") * 255
Step 3:
Erode the image slightly to try to bridge small breaks in the edges
kernel = np.ones((3,3), np.uint8)
account_number_block = cv2.erode(account_number_block, kernel, iterations=5)
Find contours
(_, cnts, _) = cv2.findContours(account_number_block.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
# cnts = sorted(cnts, key=cv2.contourArea)[:3]
rect_cnts = []  # Rectangular contours
for cnt in cnts:
    # Keep only contours that simplify to a 4-vertex polygon
    approx = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
    if len(approx) == 4:
        rect_cnts.append(cnt)
# Keep the largest rectangular contour
rect_cnts = sorted(rect_cnts, key=cv2.contourArea, reverse=True)[:1]
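Working example
Step 1: Original image
Step 2: Background removed after thresholding.
Step 3: Find contours to locate the rectangular box around the account number.
Failing example - light-colored rectangle border.
Step 1: Read the original image
Step 2: Background removed after thresholding. Note that the rectangle's edges are not connected, so I cannot get a contour from it.
Step 3: Find contours to locate the rectangular box around the account number.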
Answer 0: (score: 4)
import numpy as np
import cv2
import pytesseract as pt
from PIL import Image

# Run main
if __name__ == "__main__":
    image = cv2.imread("image.jpg", -1)

    # resize image to speed up computation
    rows, cols = image.shape[:2]
    image = cv2.resize(image, (cols // 2, rows // 2))

    # convert to gray and binarize
    gray_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    binary_img = cv2.adaptiveThreshold(gray_img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 9, 9)

    # note: erosion and dilation work on a white foreground, so invert
    binary_img = cv2.bitwise_not(binary_img)

    # dilate the image to fill the gaps
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    dilated_img = cv2.morphologyEx(binary_img, cv2.MORPH_DILATE, kernel, iterations=2)

    # find contours, discard contours which do not belong to a rectangle
    # (OpenCV 3.x returns three values: image, contours, hierarchy)
    (_, cnts, _) = cv2.findContours(dilated_img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    rect_cnts = []  # Rectangular contours
    for cnt in cnts:
        approx = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
        if len(approx) == 4:
            rect_cnts.append(cnt)

    # sort contours based on area and keep the biggest one
    rect_cnts = sorted(rect_cnts, key=cv2.contourArea, reverse=True)[:1]
    if not rect_cnts:
        raise SystemExit("No rectangular contour found")

    # find bounding rectangle of biggest contour
    x, y, w, h = cv2.boundingRect(rect_cnts[0])

    # extract rectangle from the original image
    newimg = image[y:y+h, x:x+w]

    # use 'pytesseract' to get the text in the new image
    text = pt.image_to_string(Image.fromarray(newimg))
    print(text)

    cv2.namedWindow('Image', cv2.WINDOW_NORMAL)
    cv2.imshow('Image', newimg)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
Result: 03541140011724
Result: 34785736216