Using OCR on an image with printed letters

Time: 2019-01-22 03:13:44

Tags: opencv tesseract

I want to OCR this battle log from a game:

Battle log image

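The original image uses a text font, so I inverted the colors after thresholding. Now I want to remove the "black" background (but not the black text), and I'm not sure how to do that in OpenCV. After that, I think I should make the text bolder to get better OCR results.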

How can I do this?

1 answer:

Answer 0: (score: 3)

Try this.

import cv2
import numpy as np

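# Load the screenshot and convert it to grayscale.
img = cv2.imread("1.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Invert, binarize at 128, then invert back to get a pure black-and-white image.
invert0 = cv2.bitwise_not(gray)
_, thresh = cv2.threshold(invert0, 128, 255, cv2.THRESH_BINARY)
invert1 = cv2.bitwise_not(thresh)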

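# findContours returns (image, contours, hierarchy) in OpenCV 3.x but
# (contours, hierarchy) in 4.x; taking the last two values works with both.
contours, hierarchy = cv2.findContours(invert1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]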

# Fill every outermost contour (one with no parent) to build a mask.
mask = np.zeros(gray.shape, dtype="uint8")
for i in range(len(contours)):
    if hierarchy[0][i][3] == -1:  # contour has no parent (outermost contour)
        cv2.fillPoly(mask, pts=[contours[i]], color=255)

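# Pixels outside the mask are background: turn them white and keep the text.
invert2 = cv2.bitwise_not(mask)
res = cv2.bitwise_or(invert2, invert1)  # OR instead of "+" so uint8 values cannot wrap around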

# Show each intermediate step; press any key to close the windows.
cv2.imshow("img", img)
cv2.imshow("gray", gray)
cv2.imshow("invert0", invert0)
cv2.imshow("thresh", thresh)
cv2.imshow("invert1", invert1)
cv2.imshow("invert2", invert2)
cv2.imshow("mask", mask)
cv2.imshow("res", res)

cv2.waitKey()
cv2.destroyAllWindows()
