Question

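I have this image.

Running Pytesseract with Python 3.8 produces the following problems:

The word "Phone" is read as O (not zero, but O as in Oscar).
The word "Fax" is read as 2%.
The phone number is read as (56031770

The image under consideration does not contain the boxes. The boxes come from the cv2 output, after drawing rectangles around the detected text regions/words.

The fax number is read without any problem: (866)357-7704 (including the parentheses and the hyphen).

The image is 23 megapixels (converted from a PDF file). It has already been thresholded in OpenCV to obtain a binary image. The image does not contain bold text, so I did not apply erosion.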
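For reference, a minimal sketch of how the page could be downscaled before OCR, in case the 23-megapixel size plays a role. Whether it does is an untested assumption; the 6-megapixel target and the helper names `scale_factor`, `resized_shape`, and `downscale` are arbitrary placeholders and are not part of the script below. `cv2.resize` with `cv2.INTER_AREA` is the actual OpenCV call used.

```python
import math


def scale_factor(width, height, target_megapixels=6.0):
    """Return the factor (<= 1.0) that shrinks the image to the target pixel count."""
    current = width * height
    target = target_megapixels * 1_000_000
    if current <= target:
        return 1.0  # already small enough, leave unchanged
    # Area scales with the square of the linear factor.
    return math.sqrt(target / current)


def resized_shape(width, height, factor):
    """Return the (width, height) after scaling, at least 1 pixel per side."""
    return max(1, round(width * factor)), max(1, round(height * factor))


def downscale(gray, target_megapixels=6.0):
    """Shrink a grayscale image (NumPy array) to roughly the target size."""
    import cv2  # imported here so the helpers above work without OpenCV

    h, w = gray.shape[:2]
    factor = scale_factor(w, h, target_megapixels)
    if factor == 1.0:
        return gray
    # cv2.resize expects the size as (width, height).
    # Note: resizing a binary image produces gray levels again, so any
    # thresholding would have to be repeated afterwards.
    return cv2.resize(gray, resized_shape(w, h, factor),
                      interpolation=cv2.INTER_AREA)
```

Resizing before thresholding, rather than after, avoids having to binarize twice.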

What can I do to read the phone number correctly? Thank you.

PS: I am using image_to_data (rather than image_to_string) because I also need to know where the strings are located on the page.
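For context, a minimal sketch of how word positions can be pulled out of the `image_to_data` result. It relies on the `left`, `top`, `width`, `height`, `conf`, and `text` keys that pytesseract returns with `output_type=Output.DICT`. The helper names `words_with_boxes` and `ocr_words` and the `min_conf` parameter are made up for illustration and are not part of the script below.

```python
def words_with_boxes(data, min_conf=0.0):
    """Return (text, (left, top, width, height), conf) for each recognized word.

    `data` is the dictionary produced by image_to_data(..., output_type=Output.DICT).
    Entries with empty text (block/paragraph/line rows) are skipped.
    """
    results = []
    for i, text in enumerate(data["text"]):
        word = text.strip()
        # conf may arrive as a string or a number depending on the version.
        conf = float(data["conf"][i])
        if not word or conf < min_conf:
            continue
        box = (data["left"][i], data["top"][i],
               data["width"][i], data["height"][i])
        results.append((word, box, conf))
    return results


def ocr_words(gray, min_conf=0.0):
    """Run Tesseract on an image and return the words with their boxes."""
    import pytesseract  # imported here so words_with_boxes works on its own
    from pytesseract import Output

    data = pytesseract.image_to_data(gray, output_type=Output.DICT)
    return words_with_boxes(data, min_conf)
```

Printing the confidence next to each word makes it easier to see whether misreads such as "O" or "2%" are low-confidence guesses.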

Edit: here is the relevant part of the code:

from PIL import Image
import pytesseract
from pytesseract import Output
import argparse
import cv2
import os
import numpy as np
import math
from pdf2image import convert_from_path 
from scipy.signal import convolve2d
import string

filename = "image.png"
image = cv2.imread(filename)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Estimate the noise level (sigma) of the image with a 3x3 Laplacian-like kernel
H, W = gray.shape
M = [[1, -2, 1],
     [-2, 4, -2],
     [1, -2, 1]]

sigma = np.sum(np.sum(np.absolute(convolve2d(gray, M))))
sigma = sigma * math.sqrt(0.5 * math.pi) / (6 * (W - 2) * (H - 2))

# If the image is too noisy, smooth it with a median blur
if sigma > 10:
    gray = cv2.medianBlur(gray, 3)
    print("median blur applied")
# Otherwise binarize it with Otsu thresholding
else:
    gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    print("threshold applied")


d = pytesseract.image_to_data(gray, output_type=Output.DICT)
for t in d['text'] :
    print(t)

So this would be running with psm 3 (the default).
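Because no `config` is passed, Tesseract uses its default page segmentation mode. A minimal sketch of how a different mode could be tried on just the area around the phone number follows. The `config` argument of `image_to_data` and the `--psm` flag are real, but the helper names, the 10-pixel margin, and the choice of psm 7 (treat the image as a single text line) are assumptions for illustration; nothing here has been verified to fix the misreads.

```python
def build_config(psm=3):
    """Build the Tesseract config string for a page segmentation mode (0-13)."""
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    return f"--psm {psm}"


def expand_box(box, image_width, image_height, margin=10):
    """Grow a (left, top, width, height) box by a margin, clamped to the image.

    Returns the corners (x0, y0, x1, y1), suitable for NumPy slicing.
    """
    left, top, width, height = box
    x0 = max(left - margin, 0)
    y0 = max(top - margin, 0)
    x1 = min(left + width + margin, image_width)
    y1 = min(top + height + margin, image_height)
    return x0, y0, x1, y1


def ocr_region(gray, box, psm=7, margin=10):
    """Re-run OCR on one region of the page with a chosen segmentation mode."""
    import pytesseract  # imported here so the helpers above work on their own
    from pytesseract import Output

    h, w = gray.shape[:2]
    x0, y0, x1, y1 = expand_box(box, w, h, margin)
    crop = gray[y0:y1, x0:x1]
    # Coordinates in the result are relative to the crop; add (x0, y0)
    # to map them back to the full page.
    return pytesseract.image_to_data(crop, config=build_config(psm),
                                     output_type=Output.DICT)
```

The box passed to `ocr_region` could come from the first pass, for example the `left`, `top`, `width`, and `height` of the line containing the misread phone number.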

Versions:

Tesseract: tesseract 4.1.1 (retrieved with tesseract --version) and pytesseract: version 0.3.2 (retrieved with pip3 show pytesseract)

Pytesseract image OCR - unable to recognize digits

0 answers: