I am trying to read relatively clear numbers from a screenshot, but I am having trouble getting pytesseract to read the text correctly. I have the following screenshot:
I know the score (2-0) and the clock (1:42) will always be in exactly the same positions.
Here is the code I am currently using to read the clock time and the orange score:
import cv2
import imutils
import numpy as np
import pytesseract

# BGR bounds for the orange score text
lower_orange = np.array([0, 90, 200], dtype="uint8")
upper_orange = np.array([70, 160, 255], dtype="uint8")

# "input" is the full 1080p screenshot (BGR), e.g. loaded with cv2.imread()
# Isolate the clock and scoreboard locations on a 1080p pic
clock = input[70:120, 920:1000]
scoreboard = input[70:150, 800:1120]

# greyscale the clock crop and OCR it as a single line of digits/colons
roi_gray = cv2.cvtColor(clock, cv2.COLOR_BGR2GRAY)
config = "-l eng -c tessedit_char_whitelist=0123456789: --oem 1 --psm 8"
time = pytesseract.image_to_string(roi_gray, config=config)
print("time is " + time)

# find the colors within the specified boundaries and apply the mask
mask_orange = cv2.inRange(scoreboard, lower_orange, upper_orange)

# find contours in the thresholded image, then initialize the
# list of digit locations
cnts = cv2.findContours(mask_orange.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
locs = []

for (i, c) in enumerate(cnts):
    # compute the bounding box of the contour, then use the
    # bounding box coordinates to derive the aspect ratio
    (x, y, w, h) = cv2.boundingRect(c)
    ar = w / float(h)
    # since the score will be a fixed size of about 25 x 35,
    # set the minimum area at about 300 to be safe
    if w * h > 300:
        orange_score_img = mask_orange[y-5:y+h+5, x-5:x+w+5]
        orange_score_img = cv2.GaussianBlur(orange_score_img, (5, 5), 0)
        config = "-l eng -c tessedit_char_whitelist=012345 --oem 1 --psm 10"
        orange_score = pytesseract.image_to_string(orange_score_img, config=config)
        print("orange_score is " + orange_score)
This is orange_score_img after I masked everything within the lower and upper orange bounds and applied the Gaussian blur.
So far, even though I have configured pytesseract to look for a single character and restricted the whitelist, I still cannot get it to read the digit correctly. Am I missing some other post-processing step that would help pytesseract read this digit as a 2?
Answer (score: 1)
Following @fmw42's suggestion, I tried a few morphological operations. Dilating the digit seems to do the trick!
kernel = np.ones((5, 5), np.uint8)
orange_score_img = cv2.dilate(orange_score_img, kernel, iterations=1)
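For what it's worth, I think the dilation helps because the color mask leaves fairly thin strokes, and thickening them gives tesseract a more solid glyph to match against.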
Edit: I realized the real answer is that pytesseract reads black text on a white background much better than white text on a black background! Once I inverted the colors, it read the digit perfectly:
orange_score_img = cv2.bitwise_not(orange_score_img)
I hope this helps anyone using pytesseract for the first time! It was very frustrating trying to tweak the image to fit all of my cases, and knowing up front that black text on white works better would have saved me hours...
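In case it is useful, here is a minimal, self-contained sketch of the post-processing that ended up working for me, assuming orange_score_img is the masked digit crop from the question (the kernel size and whitelist match the snippets above):

import cv2
import numpy as np
import pytesseract

def read_score_digit(mask_crop):
    # mask_crop: the white-on-black masked digit crop (a numpy array)
    kernel = np.ones((5, 5), np.uint8)
    thick = cv2.dilate(mask_crop, kernel, iterations=1)  # thicken the thin strokes
    inverted = cv2.bitwise_not(thick)  # black digit on a white background
    config = "-l eng -c tessedit_char_whitelist=012345 --oem 1 --psm 10"
    return pytesseract.image_to_string(inverted, config=config).strip()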